Archive for 2008

ABC Meme

Monday, November 24th, 2008

Instructions: type the letter ‘a’ in your browser location bar and choose the first match from the dropdown. Repeat for each letter of the alphabet.

Browser: Firefox 3.1 beta

a: Air Mozilla
b: /buildbot/steps/source.py – Buildbot – Trac
c: mozilla mozilla/configure.in
d: Digg / News
e: Enter Bug: Core
f: First National Bank of PA Personal Banking Services
g: Google Quicksearch: g
h: hghooks: Summary
i: intranet
j: mozilla mozilla/js/src/jsapi.h
k: Build Log (Brief) – Win2k3 comm-central dep unit test on 2008/10/30 15:57:27
l: Lilypond program-reference
m: tinderbox: MozillaTry
n: Google News
o: os.path – Common pathname manipulations
p: mozilla-central: pushlog
q: The irc.mozilla.org QDB: Welcome
r: mozilla-central mozilla/config/rules.mk
s: Slashdot – News for nerds, stuff that matters
t: BSBlog > Blog Archive > The Testing Matrix
u: United Airlines – Airline Tickets, Airline Reservations, Flight Airfare
v: Lilypond program-reference
w: washingtonpost.com – nation, world, technology and Washington area news and headlines
x: XPCOM Glue – MDC
y: The New York Times – Breaking News, World News & Multimedia
z: Mozilla Dehydra Cross Reference

Recognizing the Other

Thursday, November 20th, 2008

I’ve been trying to identify why this presidential election campaign has been so depressing to me. I didn’t particularly like either of the candidates or the parties behind them. But in looking at the debates and the candidates’ campaign websites, what is most distressing is that the candidates, and the country, don’t ever seem to acknowledge the good points of the other side. This is what I wish the candidates had said:

McCain

  • Pork-barrel spending is an extremely important issue, not because it represents a significant percentage of our national government’s spending, but because of the attitude it represents: each state or district in the nation fighting against the others for government handouts; not of spending wisely for the common good, or building a national infrastructure, but of making states and districts dependent on the national government.
  • The duty of the national government is to ensure the common good, and not the individual good. The taxes imposed by the national government should be applied primarily for the purposes of improving the infrastructure of the country at home and defending it abroad. As long as people are not starving or homeless, income equality is not something that should be solved by government intervention, because the risks to business and free enterprise are too great.
  • I want people to have good health and a financially secure retirement: but this is not the place of the federal government; the dangers and inefficiencies of government are too great, and the restrictions it would place on individual liberty are too severe. State and local governments are perfectly capable of performing these functions, and should be allowed to do so if the citizens decide it’s a good idea. A national health insurance mandate will do little to reduce the actual cost of health care. It will spread the cost among more people, and be temporarily cheaper, but the inevitable result will be that the national government is put in the position of either rationing health care, bankrupting business, or spending more than it can afford. We should instead focus on making everyday health care available and affordable without insurance coverage: by reforming malpractice tort laws and by working with existing health insurance providers to reform the incentives for specialist treatment that leave basic care underprovided and underfunded.
  • We have a responsibility to the safety of our own country as well as the people of Iraq. If we pull out of Iraq now, it is very likely to return to a state of insurrection and eventually dictatorship. Whether or not the Iraq war was a noble idea when it was launched, the honor of our country depends on Iraq being a stable and productive country.

Obama

  • The duty of government is not only to build the physical infrastructure of bridges and roads, but also our human infrastructure. Universal health care is important for the general welfare of our nation. It would be better if states or municipalities could provide universal coverage, but they are not able to, because ___. This leaves the federal government as the only level of government capable of implementing a solution. My proposal of mandatory coverage by employers will not cause the federal government to ration or control individual health care, while still providing our citizens with affordable health care.
  • The expense of the war in Iraq is ruining our own country in a time of economic crisis. Whether or not you believe we should have entered the war, we are now at a point where we cannot help the Iraqi people further. The Iraqi people have achieved a stable government. There is nothing more we can do in Iraq that will make it better off. Leaving Iraq will not lead to chaos, insurrection, or anarchy.

DTrace Bugs on Mac

Thursday, November 20th, 2008

Ted and I have been looking rather closely at the performance of the Mozilla build system. In order to get a better sense of where we’re spending time, I wanted to use dtrace to get statistics on an entire build.

Basic Process Information From DTrace

In theory, the dtrace proc provider lets a system administrator watch process and thread creation for a tree of processes. Using normal dtrace globals, you can track the process parent, arguments, working directory, and other information:

/* progenyof($1) lets us trace any subprocess of a specific process, in this case the shell from
   which we launch the build */

proc:::create
/progenyof($1)/
{
  printf("FORKED\t%i\t%i\t%i\n", timestamp, pid, args[0]->pr_pid);
}

proc:::exec
/progenyof($1)/
{
  printf("EXEC\t%i\t%i\t%s\t%s\n", timestamp, pid, curpsinfo->ps_args, cwd);
}

proc:::exit
/progenyof($1)/
{
  printf("EXIT\t%i\t%i\n", timestamp, pid);
}

Unfortunately, the MacOS implementation of dtrace doesn’t reflect information very well:

  • curpsinfo->ps_args doesn’t contain the entire command-line of the process; it only contains the first word
  • cwd doesn’t contain the entire working directory /builds/mddepend/ff-debug but only the last component ff-debug. Since many of our directories within the tree share names such as src and public, the information is pretty much useless.
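
On a platform where the probes report complete data, the tab-separated lines printed by the script above could be folded into one record per process. The following is a minimal Python sketch, not part of the original tooling: the function names and the sample trace are made up for illustration, and it assumes the command line itself contains no tab characters.

```python
from collections import defaultdict


def parse_trace(lines):
    """Fold FORKED/EXEC/EXIT trace lines into a dict keyed by pid."""
    procs = defaultdict(dict)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        kind = fields[0]
        if kind == "FORKED":
            # proc:::create fires in the parent: pid is the parent,
            # args[0]->pr_pid is the new child.
            timestamp, parent, child = (int(f) for f in fields[1:4])
            procs[child]["parent"] = parent
            procs[child]["start"] = timestamp
        elif kind == "EXEC":
            pid = int(fields[2])
            procs[pid]["args"] = fields[3]
            procs[pid]["cwd"] = fields[4]
        elif kind == "EXIT":
            timestamp, pid = int(fields[1]), int(fields[2])
            procs[pid]["end"] = timestamp
    return dict(procs)


def wall_time_ns(proc):
    """Nanoseconds between fork and exit, or None if either is missing."""
    if "start" in proc and "end" in proc:
        return proc["end"] - proc["start"]
    return None


# Hypothetical trace output, for illustration only.
sample = [
    "FORKED\t1000\t100\t101",
    "EXEC\t1500\t101\tgcc -c foo.c\t/builds/mddepend/ff-debug",
    "EXIT\t5000\t101",
]
procs = parse_trace(sample)
print(procs[101]["args"], wall_time_ns(procs[101]))
```

Keying the records by child pid keeps the parent link, so the process tree of a build could be rebuilt afterwards from the same dictionary.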

Process CPU Time in DTrace

DTrace doesn’t give scripts a simple way to track the CPU time used by a process: the kernel psinfo_t struct does have a pr_time member, but this is of non-reflected struct timestruc_t.

There is another way to calculate this: dtrace exposes a variable vtimestamp which represents, for each thread, a virtual timestamp when that thread was executing. By subtracting the vtimestamp at proc:::lwp-start from the vtimestamp at proc:::lwp-exit you can calculate the time spent in each thread, and use sums to calculate the per-process total.

proc:::lwp-start
/progenyof($1)/
{
  self->start = vtimestamp;
}

proc:::lwp-exit
/self->start/
{
  @[pid] = sum(vtimestamp - self->start);
  self->start = 0;
}

END
{
  printf("%-12s %-20s\n", "PID", "TIME");
  printa("%-12i %@i\n", @);
}

Unfortunately, the MacOS implementation of DTrace has a serious bug in the implementation of proc:::lwp-start: it isn’t fired in the context of the thread that’s being started, but in the context of the thread (and process!) that created the thread. This means that the pid and vtimestamp reported in the probe are useless. I have filed this with Apple as radar 6386219.

Summary

Overall, the bugs in the Apple implementation of DTrace make it pretty much useless for doing the build system profiling I intended. I am now trying to get an OpenSolaris virtual machine up for building, since I know that DTrace is not broken on Solaris; but never having used Solaris before, I’ll save that story for another day.

Paginate the Web?

Tuesday, October 28th, 2008

Web pages scroll, usually vertically. Is this a good thing?

I was reading an article that Deb pointed out from The Atlantic: “Is Google Making Us Stupid?” and I noticed something: I could easily keep attention on the page when I wasn’t scrolling. But as soon as I got to the bottom of the page, it was much harder to stay focused.

What if web browsers paginated articles by default, instead of laying them out in a vertical scroll view by default? Would that improve reader attention span, or just cause users to stop reading after the first page?

Is it possible to write a Firefox extension to render websites as paginated entities instead of scrolling entities? I suspect not, and that it would require assistance from the core Gecko layout engine, but I think it would be a very interesting UI experiment!

Unusual Town Names in Pennsylvania

Tuesday, October 21st, 2008

Moving to Pennsylvania a few years ago, I discovered that, unlike Virginia, not all places are named after English nobility or geographic features. I have collected some of the more amusing/unusual place names I discovered in Pennsylvania into a Google map.


I love how the triangle of Paradise, Intercourse, and Fertility are so far removed from Climax.

Many coal towns in Western PA were named by the mining companies. Uninventive names like “Mine 71” are common. But I think the best is “Revloc”. This is the next town over from Colver, and whoever named it just decided to spell Colver backwards.

Cambria County doesn’t allow outdoor advertising of pornography; so immediately outside of the county on U.S. Route 22 (near Climax, PA) there is a group of video stores and strip clubs. One sign in particular has a rather amusing misuse of quotation marks:

“Live” girls!!!

PUTting and DELETEing in python urllib2

Tuesday, October 21st, 2008

The urllib2 Python module makes it pretty simple to GET and POST data using HTTP (and other protocols). But there isn’t a good built-in way to issue HTTP PUT or DELETE requests. I ran into this limitation while working on a project to upload automatically generated documentation to the Mozilla Developer Center. The DekiWiki API for uploading a file attachment uses the HTTP PUT method.

It turns out there is an easy workaround. You can subclass the urllib2.Request class and explicitly override the method:

import urllib2

class RequestWithMethod(urllib2.Request):
  def __init__(self, method, *args, **kwargs):
    self._method = method
    urllib2.Request.__init__(self, *args, **kwargs)

  def get_method(self):
    return self._method
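
As a usage sketch, the subclass can be exercised without touching the network by building requests and checking which verb they report. The URL and payload below are hypothetical placeholders, not the real DekiWiki endpoint, and the fallback import is only there so the example also runs on Python 3, where the same Request class lives in urllib.request.

```python
try:
    import urllib2  # Python 2, as in the post
except ImportError:
    import urllib.request as urllib2  # Python 3 home of the same Request class


class RequestWithMethod(urllib2.Request):
    """A Request whose HTTP verb is chosen by the caller."""

    def __init__(self, method, *args, **kwargs):
        self._method = method
        urllib2.Request.__init__(self, *args, **kwargs)

    def get_method(self):
        return self._method


# Hypothetical endpoint and payload, for illustration only.
UPLOAD_URL = "http://wiki.example.invalid/files/report.html"

put_request = RequestWithMethod(
    "PUT",
    UPLOAD_URL,
    data=b"<p>generated docs</p>",
    headers={"Content-Type": "text/html"},
)
delete_request = RequestWithMethod("DELETE", UPLOAD_URL)

print(put_request.get_method())
print(delete_request.get_method())

# Sending is left commented out because the URL above is not real:
# response = urllib2.urlopen(put_request)
```

Passing the method as the first positional argument leaves every other Request argument (URL, data, headers) untouched, so existing GET and POST call sites only need the extra leading verb.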

Preview for Thursday’s post: the generated documentation is already online.

What Do People Do All Day?

Thursday, October 9th, 2008

There are very few picture books which talk about money, and even fewer do it well. Richard Scarry’s What Do People Do All Day? is a notable and wonderful exception.

Scan from "What Do People Do All Day?": Farmer Alfalfa selling produce

Throughout the book, characters are creating value by farming, tailoring, or baking. They sell their goods for money, use it to pay for raw materials, buy gifts for their wives, and put the extra in the bank. When the tailor decides to build a new house, he hands a large sack of money to the builders. When the mayors of two towns decide to pave a road between them, they have several huge sacks of money for the road builders.

I recommend pretty much anything by Richard Scarry, but this is my personal favorite. If you have children under the age of ten, or just love picture books, look for it in your local library or bookstore.

Release branches in mozilla-central

Thursday, October 9th, 2008

On the Firefox development branches, the version number is always “pre”: for example, 3.1b1pre. This makes it easy to distinguish between nightly builds and release builds. To produce a release, the release team creates a “minibranch”. This minibranch exists for the following reasons:

  • To allow bumping the version numbers to the release version, for example: 3.1b1.
  • To isolate the release process and allow the main development tree to re-open as quickly as possible.

This is a long-standing tradition in CVS, but we haven’t really done it before 3.1b1 in Mercurial. This week, mozilla-central grew a new branch: the GECKO191b1_20081007_RELBRANCH. Pushing this branch to mozilla-central caused some unexpected side effects for developers:

  1. Developers who issued a normal hg pull -u got the following message:
    adding 171 changesets with 234 changes to 110 files (+1 heads)
    not updating, since new heads added
    (run 'hg heads' to see heads, 'hg merge' to merge)

    Yes, a new head was added; but this head is on a named branch and shouldn’t affect developers who aren’t on that branch. This is a bug in Mercurial that will be fixed in future versions. To work around the problem, just run hg up, which will update you to the latest revision of the default branch.

  2. hg heads shows branch heads. Normally, developers working on the default branch don’t care about heads on other branches, and don’t want release branch heads showing up when they issue the hg heads command. The Mercurial developers are aware of this issue and will fix it in a future version. In the meantime, use the following command to see only the heads of the default branch: hg heads default.

Note: even with the above bugs fixed, hg pull -u isn’t the exact equivalent of hg pull; hg up: in the case where no new changes are available on the remote server, no update will be performed. This only affects trees where the working directory is not at the tip revision. This slightly unintuitive behavior is considered a feature by the Mercurial developers, not a bug.

A Good Bookstore

Tuesday, October 7th, 2008

I was sad to read about the closing of Olsson’s Books and Records in Washington D.C. When I lived in D.C., I worked two blocks from the Olsson’s downtown; I would drop in frequently during lunch, and spent an impressive fraction of my salary there.

While the prices and selection at Olsson’s were both competitive, what was most impressive and enjoyable was the staff: it was clear that everyone in the store loved books. After I became a “regular”, they remembered my tastes and interests, gave me relevant recommendations, and were really good salesmen.

I’m sure that the “new” Barnes and Noble on 12th street had something to do with Olsson’s decline. Olsson’s probably couldn’t compete with the extended hours and the café. I don’t want to malign the huge chains: I enjoy spending a couple hours in a mega-super-duper Barnes and Noble. I’ve noticed that when a new store opens, the staff is really helpful and knowledgeable. But after a year or so, the quality of staff at the information desk starts dropping, sometimes dramatically. How can you be a really helpful bookseller if you’ve never heard of St. Augustine or Orson Scott Card? Why can’t the mega-chains keep book-loving sales staff?

Parsing IDL in Python

Tuesday, October 7th, 2008

One of the current pain points in our build system is the xpidl compiler. This is a binary tool which is used to generate C++ headers and XPT files from XPIDL input files. The code depends on libIDL, which in turn depends on glib. Because this tool is a build-time requirement, we have to build a host version, and in most cases we also build a target version to put in the SDK package.

Getting glib and libidl on Linux systems is not very difficult: all the major distros have developer packages for them. But getting libidl and glib on Windows and Mac can be quite painful. On Windows, we have had to create our own custom static library versions of this code which are compatible with all the different versions of Microsoft Visual C++. On Mac you can get them from MacPorts, but as far as I know they are not available in universal binaries, which means that you can’t cross-compile a target xpidl.

Parsing IDL, while not trivial, is not so complicated that it requires huge binary code libraries. So a while back I reimplemented the XPIDL parser using python and the PLY (python lex-yacc) parsing library. The core parsing grammar and object model is only 1200 lines of code.

Because we don’t have any unit tests for xpidl, I chose to use A-B testing against the output of the binary xpidl: the header output of the python xpidl should match byte-for-byte the header output of the binary xpidl. I wrote a myrules.mk file which would automatically build and compare both versions during a build. This turned out to be a royal pain, because the libIDL parser is not very consistent and has bugs:

  • Some, but not all attributes are re-ordered so that they are no longer in original order, but are ordered according to an internal glib hash function.
  • The code which associates doc-comments with IDL items is buggy: in many cases the comment is associated with a later item.

I had to add some temporary nasty hacks in order to work around these issues. And finally, reproducing the wacky whitespace of the binary tool wasn’t worthwhile, so I started comparing the results using diff -w -B. But with these hacks and changes, both xpidl compilers produce identical C++ headers.
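
The whitespace-insensitive comparison can also be approximated in a few lines of Python. This is only a sketch of the idea behind diff -w -B (ignore whitespace inside lines, ignore blank lines), not a reimplementation of diff; the function names and the sample header text are hypothetical.

```python
def normalize(text):
    """Strip all whitespace inside each line and drop lines left empty."""
    stripped = ("".join(line.split()) for line in text.splitlines())
    return [line for line in stripped if line]


def headers_match(first, second):
    """True when two generated headers differ only in whitespace or blank lines."""
    return normalize(first) == normalize(second)


# Hypothetical header fragments, for illustration only.
binary_output = "class nsIFoo {\n\n  NS_IMETHOD Bar(void) = 0;\n};\n"
python_output = "class nsIFoo {\n    NS_IMETHOD  Bar( void ) = 0;\n};"
changed_output = "class nsIFoo {\n  NS_IMETHOD Baz(void) = 0;\n};\n"

print(headers_match(binary_output, python_output))
print(headers_match(binary_output, changed_output))
```

Because the line structure is preserved, a statement that moves onto a different line still counts as a difference; only spacing and blank lines are forgiven.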

I completed the code to produce C++ headers during a couple of not-quite-vacation days, but I didn’t write any code to produce XPT files. I shelved the project as an attractive waste of time, until jorendorff needed an IDL parser to produce quick stub C++ code. Jason managed to take my existing code and hook up a quick-stub generator to it. The python xpidl parser and quick-stub generator are both used in the codebase.

Currently, we’re still using the old binary xpidl to produce C++ headers and XPT files. If somebody is interested, I’d really like help adding code to produce XPT files from the new parser, so that we can ditch the old binary code completely.

If you ever need to use python to parse some interesting grammar, I highly recommend PLY. If you turn on optimization it performs very well, and it has very good support for detailed error reporting.