Archive for the 'untagged' Category

Lost in the Cosmos: The Last Self-Help Book

Thursday, April 2nd, 2009

Part of an occasional series of less-well-known books I love.

Lost in the Cosmos: The Last Self-Help Book by Walker Percy

Despite the subtitle, Lost in the Cosmos is only a mock self-help book. It doesn’t pretend to have answers, but instead has many questions, a multi-chapter diversion about the theory of symbols and language, and two science-fiction short stories.

In a recent issue of a home-and-garden magazine, an article listed fifty ways to make a coffee table.

One table was made of an old transom of stained glass supported by an antique brass chandelier cut ingeniously to make the legs.
Another was a cypress stump, waxed and highly polished.
Another was a big spool used for telephone cable set on end.
Another was a lobster trap.
Another was a Coca-Cola sign propped on Coke crates.
Another was a stone slab from an old morgue, the blood runnel used as an ash tray.
Another was a hayloft door set on cut-down sawhorses.
Another was the hatch of a sailboat mounted on halves of ships’ wheels.
Another was a cobbler’s bench.

Not a single one was a table designed as such, that is, a horizontal member with four legs.

Question: Why was not a single table designed as such rather than being a non-table doing duty as a table?

(a) Because people have gotten tired of ordinary tables.
(b) Because the fifty non-tables converted to use as tables make good conversation pieces.
(c) Because it is a chance to make use of valuable odds and ends which would otherwise gather dust in the attic.
(d) Because the self in the twentieth century is a voracious nought which expands like the feeding vacuole of an amoeba seeking to nourish and inform its own nothingness by ingesting new objects in the world but, like a vacuole, only succeeds in emptying them out.

(Check One)

The first half of the story is questions such as this, interspersed with hypothetical scenarios. Each question focuses on some aspect of who we are, both individually and collectively. Then, in an odd but effective twist, the book veers off into a chapter about “semiotics”: how we use symbols to communicate, how these symbols gain meaning, and how we use symbols differently than any other creature.

The end of Lost in the Cosmos contains two science-fiction short stories. The first story is particularly compelling because it deals with the fall of man: man gained consciousness of himself and soon after was destroyed by that very consciousness. The alien race, on the other hand, has preternatural consciousness: they have avoided “the malformations of self consciousness”. The aliens refuse to let the humans land, because the humans are fallen and have not asked for help. The state of fallen mankind described in this story is different from, but has striking parallels to, the story of the fall of Adam.

For a book which bounces quickly through language, transcendence, evolution, boredom, and sexuality, and pretends to classify everyone’s self-understanding in 16 categories, Lost in the Cosmos is a surprisingly coherent and enjoyable story. It is sometimes aggravating, especially when none of the multiple-choice answers comes close to your answer, but it will always provoke thought about how man knows himself, and how you personally know yourself.

Available at amazon.com.

If you’ve read Lost in the Cosmos and want to dive deeper, Walker Percy wrote a less fictional book on language: The Message in the Bottle: How Queer Man Is, How Queer Language Is, and What One Has to Do with the Other (also available at amazon.com).

Less-Known Books: Enchantment

Tuesday, March 3rd, 2009

Like many of my friends and coworkers, I love books. Often I’m surprised that my friends have never heard of a book I particularly love. I think there are enough of them that I could post every other Thursday for the next year: I hope my friends and enemies will post their favorite books I might not know about in return.

Enchantment, by Orson Scott Card

After all the fairy tales he had read and studied, the one possibility he had never entertained was this: That they might be true…

When twentieth-century scholar Ivan Smetski discovers Sleeping Beauty and saves her from the spell of the evil witch Baba Yaga by proposing to her, their troubles have only just begun. Together, Katerina and Ivan must save her kingdom from the evil witch, who has enslaved the bear-god and uses his power.

Orson Scott Card is a master of character, and Enchantment shows the true power of his writing and creativity. We see the rich relationship between Ivan’s parents; the interdependence between the kingdom and the king in Katerina’s time; the truce between the Church and witchcraft; and the slow discovery of marriage between Ivan and Katerina themselves. The treatment of marriage is enlightening and joyous.

Card is rightfully famous for his science-fiction novels, especially the Ender series. But Enchantment is one of his best, and is certainly worth a trip to the library or amazon.com.

Performance and Pymake

Thursday, February 26th, 2009

pymake runs correctly now. With some Mozilla patches, I can get it to recurse through most of the tree within a single process, doing parallel builds correctly, on Windows, Linux, and Mac, under Python 2.4 and 2.5.

6x as slow

Performance is bad. On a benchmark of a do-nothing rebuild of make tier_xpcom, which is about 40 recursive makes:

bsmedberg $ time make tier_xpcom
real	0m1.129s

bsmedberg $ time python -O /builds/pymake/make.py tier_xpcom
real	0m7.525s

This is a bit depressing, because improved Windows performance was one of the primary reasons for doing pymake. It’s possible that the Windows numbers are better, given that launching processes on Windows is so much more expensive than on Linux: but I doubt that will make up the whole 6x performance loss.

Improved 50% today!

Earlier today, the number was about 15 seconds. Profile-guided patches helped reduce execution time significantly!

The first optimization was string-joining performance. str.join showed up near the top of profiles both in call counts and own-time. Expansion objects are a core data structure in pymake: the parser turns a string such as “Hello$(COMMA), $(subst a,o,warld)!” into a list of bare strings and function calls (a variable expansion is treated as a function). An expansion is “resolved” to a string by passing it variables and a makefile context. The original, naive way of resolving an Expansion used ''.join() on the elements. This was replaced, in two phases, with a system of iterators which recursively yield strings, and an itersplit method, which splits an expansion into words without even joining it into a single string. A final optimization replaced ''.join entirely: in 99% of cases it was better to use simple appending methods when few elements are being joined.
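To make the shape of this concrete, here is a minimal sketch of the iterator approach, with hypothetical names rather than pymake’s actual classes: elements are yielded as string fragments, and joining only happens when a caller truly needs a single string.

class Expansion(object):
    def __init__(self, elements):
        # elements: bare strings, plus callables representing functions
        # and variable expansions; each callable yields string fragments.
        self.elements = elements

    def iterfragments(self, variables):
        for e in self.elements:
            if callable(e):
                for fragment in e(variables):
                    yield fragment
            else:
                yield e

    def resolve(self, variables):
        # Join only at the very end, when one string is truly required.
        return ''.join(self.iterfragments(variables))

    def itersplit(self, variables):
        # Yield whitespace-separated words without ever building the
        # full joined string.
        leftover = ''
        for fragment in self.iterfragments(variables):
            buf = leftover + fragment
            words = buf.split()
            if buf and not buf[-1:].isspace():
                leftover = words.pop()  # word may continue in next fragment
            else:
                leftover = ''
            for word in words:
                yield word
        if leftover:
            yield leftover

The win is that word-oriented callers, like the list-processing make functions, consume words one at a time and never pay for a full join.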

Another optimization avoids parsing makefile syntax into expansions until it’s actually needed. In many cases, makefiles will use a small set of variables many times, and will never read the value of other variables. The first candidate optimization had pymake parse variables as they were set; a much better solution later was to lazily parse variables the first time they were read.
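A hedged sketch of what that lazy scheme can look like (hypothetical helper names, not pymake’s actual data module): the raw text is stored at assignment time and only parsed into an Expansion on first read.

class Variables(object):
    def __init__(self, parsefunc):
        self._source = {}   # name -> unparsed variable text
        self._parsed = {}   # name -> cached Expansion
        self._parsefunc = parsefunc

    def set(self, name, source):
        self._source[name] = source
        self._parsed.pop(name, None)   # invalidate any cached parse

    def get(self, name):
        if name not in self._parsed:
            # Parsing happens here, on first read; variables that are
            # set but never read are never parsed at all.
            self._parsed[name] = self._parsefunc(self._source[name])
        return self._parsed[name]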

A grab-bag of other optimizations improved performance by a bit, but the last attempt increased code complexity far more than performance.

Hitting a Performance Barrier

At the moment I think pymake has hit a performance barrier and I’m not sure how to proceed. The current profile of pymake, generated with cProfile, is mostly unhelpful:

7228961 function calls (6902795 primitive calls) in 10.934 CPU seconds

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    15529    0.783    0.000    2.059    0.000 /builds/pymake/pymake/parser.py:689(parsemakesyntax)
49054/35478    0.555    0.000    2.320    0.000 /builds/pymake/pymake/util.py:24(itersplit)
      128    0.396    0.003    0.396    0.003 {posix.read}
466085/222572    0.356    0.000    2.653    0.000 /builds/pymake/pymake/data.py:192(resolve)
    51384    0.289    0.000    1.491    0.000 /builds/pymake/pymake/parserdata.py:214(execute)
    13876    0.288    0.000    1.007    0.000 /builds/pymake/pymake/data.py:684(resolvevpath)
    29268    0.280    0.000    0.280    0.000 {posix.stat}
   171027    0.266    0.000    0.471    0.000 /builds/pymake/pymake/data.py:430(match)
    25700    0.246    0.000    0.327    0.000 /builds/pymake/pymake/data.py:384(__init__)
    40350    0.223    0.000    0.223    0.000 /usr/lib64/python2.5/logging/__init__.py:1158(getEffectiveLevel)
    58982    0.213    0.000    0.329    0.000 /builds/pymake/pymake/data.py:312(set)
43854/42319    0.211    0.000    1.572    0.000 /builds/pymake/pymake/functions.py:56(resolve)
131959/117714    0.207    0.000    1.343    0.000 /builds/pymake/pymake/data.py:258(get)
     2130    0.194    0.000    2.281    0.001 /builds/pymake/pymake/data.py:542(resolveimplicitrule)
    47515    0.189    0.000    0.421    0.000 /builds/pymake/pymake/data.py:1204(gettarget)
      128    0.182    0.001    0.182    0.001 {posix.fork}
     7717    0.174    0.000    1.941    0.000 /builds/pymake/pymake/parserdata.py:117(execute)
    57298    0.173    0.000    0.255    0.000 /usr/lib64/python2.5/posixpath.py:56(join)
    73798    0.165    0.000    0.628    0.000 /builds/pymake/pymake/data.py:1103(matchesfor)
    46953    0.157    0.000    0.184    0.000 /builds/pymake/pymake/parser.py:176(iterdata)
1153401/1150418    0.156    0.000    0.158    0.000 {len}
    27900    0.156    0.000    0.163    0.000 /builds/pymake/pymake/data.py:67(__init__)
11264/168    0.148    0.000    6.120    0.036 /builds/pymake/pymake/parserdata.py:431(execute)
37008/23960    0.141    0.000    0.176    0.000 /builds/pymake/pymake/parser.py:193(itermakefilechars)
   330817    0.135    0.000    0.135    0.000 {method 'startswith' of 'str' objects}

parsemakesyntax, the function which parses $(FOO) $(BAR) into an Expansion, is still the single most time-consuming function. But since I don’t have line-by-line heatmaps, it’s hard to know what parts of that function might be inefficient. The callee data is not much help:

                                                          ncalls  tottime  cumtime
/builds/pymake/pymake/parser.py:689(parsemakesyntax)  ->   27666    0.017    0.017  /builds/pymake/pymake/data.py:111(__init__)
                                                             233    0.000    0.001  /builds/pymake/pymake/data.py:116(fromstring)
                                                           33408    0.059    0.097  /builds/pymake/pymake/data.py:145(appendstr)
                                                           10219    0.014    0.020  /builds/pymake/pymake/data.py:156(appendfunc)
                                                           27666    0.084    0.212  /builds/pymake/pymake/data.py:183(finish)
                                                            1474    0.002    0.002  /builds/pymake/pymake/functions.py:24(__init__)
                                                            1271    0.002    0.002  /builds/pymake/pymake/functions.py:32(setup)
                                                            2765    0.005    0.007  /builds/pymake/pymake/functions.py:40(append)
                                                            8315    0.018    0.022  /builds/pymake/pymake/functions.py:48(__init__)
                                                             430    0.001    0.001  /builds/pymake/pymake/functions.py:70(__init__)
                                                             203    0.001    0.002  /builds/pymake/pymake/functions.py:380(setup)
                                                           25800    0.106    0.346  /builds/pymake/pymake/parser.py:73(getloc)
                                                            9986    0.068    0.072  /builds/pymake/pymake/parser.py:97(findtoken)
                                                           27239    0.035    0.072  /builds/pymake/pymake/parser.py:155(get)
                                                           46953    0.157    0.184  /builds/pymake/pymake/parser.py:176(iterdata)
                                                           15245    0.071    0.133  /builds/pymake/pymake/parser.py:193(itermakefilechars)
                                                            6440    0.026    0.033  /builds/pymake/pymake/parser.py:247(itercommandchars)
                                                           25515    0.032    0.032  /builds/pymake/pymake/parser.py:673(__init__)
                                                           15529    0.003    0.003  {callable}
                                                           28565    0.008    0.010  {len}
                                                            9986    0.003    0.003  {method 'append' of 'list' objects}
                                                               1    0.000    0.000  {method 'iterkeys' of 'dict' objects}
                                                            9986    0.005    0.005  {method 'pop' of 'list' objects}

Yes, I know getloc is inefficient, and a commenter on one of my previous posts suggested a possible solution. But that’s not going to create any significant improvement. In order to reach performance parity with GNU make, there has to be an algorithmic improvement.

Can I trust cProfile?

There are some confusing aspects to the cProfile output which make me suspect it. In particular, I suspect that generator functions are not being accounted for correctly: the primary work of the iterdata function is to call match on a compiled regular expression object, but that method doesn’t even show up in the callee list:

                                                   ncalls  tottime  cumtime
/builds/pymake/pymake/parser.py:176(iterdata)  ->   17815    0.004    0.004  {built-in method end}
                                                    17815    0.004    0.004  {built-in method group}
                                                    35630    0.014    0.014  {built-in method start}
                                                    25222    0.005    0.005  {len}

In any case, it’s hard to usefully analyze the profiling output. What I want is a Shark-like hierarchical profile. Apparently, dtrace can profile Python code, but I haven’t seen any useful visualization/analysis tools for that combination: if anyone knows of something, please let me know!
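For reference, a profile like the ones above can be captured and queried with the standard cProfile and pstats modules; the entry point and file names below are placeholders, not pymake’s actual interface.

import cProfile
import pstats

# Capture a whole run; main([...]) stands in for the real entry point.
cProfile.run('main(["tier_xpcom"])', 'pymake.prof')

stats = pstats.Stats('pymake.prof')
stats.sort_stats('time').print_stats(20)   # top 20 functions by own-time
stats.print_callees('parsemakesyntax')     # per-callee breakdown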

Help Wanted

I’ve been head-down in pymake for a few weeks now; my review queue and several other Mozilla tasks need attention. I’d really love some help from people who know Python performance (or who have no fear). If you’re interested and want some guidance, please e-mail me or leave a comment here. We can probably construct some useful pymake performance tests that are not as scary as actually building Mozilla.

Sanity and Testcases for pymake

Monday, February 23rd, 2009

Testcases kept me sane while writing (and rewriting) pymake. This shouldn’t be a surprise to experienced developers: most developers agree that test-driven development is good. Often, however, beginning programmers don’t know how to start a project with adequate testing. This post attempts to describe the pymake test environment and give examples of pymake tests.

I started pymake with fear and trepidation. I’ve been working extensively with makefiles for 6 years; makefile parsing and execution still occasionally surprise me. This fear was a great motivator: if I had thought this an easy job, I might have skipped writing tests until much later in the process. But the testsuite has been absolutely essential: I doubt I could have completed initial development in two weeks without it, and there is no way I could have refactored the code to support in-process recursion and parallel make this week without it.

Start Small

The most important hurdle in a new project is creating a framework to run tests. The requirements for a test framework are pretty simple:

  • make it easy to write new tests;
  • make it easy to run the tests;
  • don’t waste time writing fancy test apparatus.

The specifics of your test framework will depend on your project. When possible, re-use existing frameworks, or at least borrow extensively from them. For pymake, I use two basic types of test: makefile tests and python unit tests.

Makefile Tests

Because the entire purpose of pymake is to parse and execute makefiles, pymake has a test harness for parsing and executing makefiles. This test harness runs make against a testcase makefile; parsing and executing the makefile should complete successfully (with a 0 exit code) and print TEST-PASS somewhere during execution. Typically, each makefile will test a single feature or a related set of features.
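The heart of such a harness can be very small. As a rough sketch of the pass/fail check (not the actual runtests.py code):

import subprocess

def makefile_test_passes(make_command, makefile):
    # make_command is e.g. ['make'] or ['python', '/builds/pymake/make.py'].
    # A test passes iff make exits 0 and prints TEST-PASS.
    proc = subprocess.Popen(make_command + ['-f', makefile],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output = proc.communicate()[0]
    return proc.returncode == 0 and 'TEST-PASS' in output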

This test harness is particularly important because pymake is supposed to be a mostly drop-in replacement for GNU make. This test harness can be used to test both GNU make and pymake. The harness was committed in revision 1 of the pymake repository, long before pymake could parse makefiles. The first tests were tests of GNU make behavior, in cases where that behavior was under-documented or confusing. Before I started implementing the meat of the parser, I already had discovered several interesting behaviors and written tests for them.

tchaikovsky:/builds/pymake $ python tests/runtests.py # run the testsuite using GNU make
tchaikovsky:/builds/pymake $ python tests/runtests.py -m /builds/pymake/make.py # run the testsuite using make.py

As the project became basically functional, each new feature was committed with a test. See, for instance, a fix for parsing line continuations with conditional blocks.

Initially, the makefile test harness only checked for success. But an important part of most test suites is to check for proper error handling. runtests.py grew additional features to allow a makefile to specify that it should return a failure error code, and also to specify a command line. It also ran each test in a clean directory, to avoid unexpected interactions between tests.

Writing makefile testcases often required creativity. It’s often important to check that commands are executed in a specified order, or that a particular command is only executed once. One technique is to append output to a signal file while running commands, and then test the contents of the file (tests/diamond-deps.mk):

# If the dependency graph includes a diamond dependency, we should only remake
# once!

all: depA depB
	cat testfile
	test `cat testfile` = "data";
	@echo TEST-PASS

depA: testfile
depB: testfile

testfile:
	printf "data" >>$@

This same technique is also useful to make sure that parallel execution is enabled or disabled appropriately: tests/parallel-toserial.mk.

Python Unit Tests

In the early stages of pymake, only some portions of the data model and parser were implemented: there were lots of low-level functions for location-tracking, line continuations, and tokenizing. It was important to me that these low-level functions were rock-solid before I started attempting to glue them together.

The python standard library includes the unittest module, which is a simple framework for creating and running a test suite.

import unittest
class MyTest(unittest.TestCase):
  # any function named test* will be run as a single test case
  def test_arrayindex(self):
    self.assertEqual([1, 2, 3][0], 1)

if __name__ == '__main__':
  unittest.main()  # discover and run the test* methods above

pymake uses the unittest module to test the data model and parser: tests/datatests.py and tests/parsertests.py.

One annoying limitation of the unittest module is that is difficult to construct a set of test cases that run the same test code on different input data. To solve this problem, I wrote a multitest helper function. The developer writes a class with a testdata dictionary and a runSingle method, and multitest will create a test function for each element in the test data:

def multitest(cls):
    for name in cls.testdata.iterkeys():
        # name=name binds the loop variable at function-definition time,
        # so each generated test method uses its own dictionary entry.
        def m(self, name=name):
            return self.runSingle(*self.testdata[name])

        setattr(cls, 'test_%s' % name, m)
    return cls

class TokenTest(TestBase):
    testdata = {
        'wsmatch': ('  ifdef FOO', 2, ('ifdef', 'else'), True, 'ifdef', 8),
        'wsnomatch': ('  unexpected FOO', 2, ('ifdef', 'else'), True, None, 2),
        'wsnows': ('  ifdefFOO', 2, ('ifdef', 'else'), True, None, 2),
        'paren': (' "hello"', 1, ('(', "'", '"'), False, '"', 2),
        }

    def runSingle(self, s, start, tlist, needws, etoken, eoffset):
        d = pymake.parser.Data.fromstring(s, None)
        tl = pymake.parser.TokenList.get(tlist)
        atoken, aoffset = d.findtoken(start, tl, needws)
        self.assertEqual(atoken, etoken)
        self.assertEqual(aoffset, eoffset)
multitest(TokenTest)

Tests Allow For Simple Refactoring

Every project I’ve worked on has had to refactor code after it was first written. Sometimes you know you’ll have to refactor code in the future. Other times, you discover the need to refactor code well after you’ve started writing it. In either case, the test suite can allow you to perform large-scale refactoring tasks with confidence. Two examples will help explain how refactoring was important:

Makefile Variable Value Representation

VAR = $(OTHER) $(function arg1,arg2)

Makefiles have two different “flavors” of variables, recursive and simple. When I first started pymake, I decided to parse recursive variable declarations “immediately” into an Expansion object. This worked well, and it made reporting the locations of parse errors easy.

Unfortunately, there is a case where you cannot parse a variable value immediately:

VAR = $(function
VAR += arg1,
VAR += arg2
VAR += )

In this case, VAR cannot be parsed until it has been fully constructed. Fixing this case involved changing the entire data model of variable storage:

Revision 64 (348f682e3943)

Adding a makefile test for the failing case.

Revision 67 (63531e755f52)

Refactoring variable storage to account for dynamically-composed variables.

Independent Parsing Model

Because parsing doesn’t perform very well, it’s good to optimize it away when possible. The original parsing code went through each makefile line by line and inserted rules, commands, and variables into the makefile data structure immediately. This makes it difficult or impossible to save the parsed structure and re-use it. On Friday I refactored the parser into two phases. The first phase creates a hierarchical parsing model independent of any particular makefile. The second phase executes the parsing model in the context of the variables of a particular makefile.
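A minimal sketch of that two-phase shape, with hypothetical class names rather than pymake’s actual parserdata module:

# Phase 1 (parsing) produces statement objects like these. They refer
# to no particular makefile, so the resulting model can be cached and
# re-executed many times.
class SetVariable(object):
    def __init__(self, name, source):
        self.name = name
        self.source = source

    def execute(self, makefile):
        makefile.setvariable(self.name, self.source)  # hypothetical API

class ConditionBlock(object):
    def __init__(self, condition, iftrue, iffalse):
        self.condition = condition
        self.iftrue = iftrue
        self.iffalse = iffalse

    def execute(self, makefile):
        # Conditionals are decided in phase 2, against the variables
        # of this particular makefile.
        if self.condition(makefile):
            self.iftrue.execute(makefile)
        else:
            self.iffalse.execute(makefile)

class StatementList(object):
    def __init__(self, statements):
        self.statements = statements

    def execute(self, makefile):
        for statement in self.statements:
            statement.execute(makefile)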

After first implementing this change, I found one serious error: I was associating commands with rules without considering conditionals such as ifdefs:

all:
	command1
ifdef FOO
	command2
else
	command3
endif

Fortunately, tests/ifdefs.mk was already in the testsuite, and detected this error. Fixing it required reworking the parsing model with an extra execution context to correctly associate commands with their parent rules.

Secondly, after committing the parsing model, I found an additional regression when building Mozilla: the behavior of “ifndef” was reversed due to an early return statement. I was able to add an additional test and a simple fix, once I figured out what was going on.

pymake status

pymake features implemented since last week:

  • Implement $(eval)
  • 122:1995f94b1c2f: Implement the vpath directive
  • 123:17169ca68e03: Implement automatic wildcard expansion in targets and prerequisites. I hate this, but NSS uses it, and I hate NSS more.
  • 135:fcb8d4ddd21b: Run submakes within the same process if possible
  • parallel-execution branch: Parallel execution of commands (-jN)
  • 156:3ae1e58c1a25: Cache parser models (avoid reparsing rules.mk)
  • win32-msys branch: Ted has pymake working on Windows. It doesn’t build Mozilla yet because we leak MSYS paths into makefiles, but that shouldn’t be hard to fix.

pymake: A Mostly GNU-compatible `make`, Written in Python

Friday, February 13th, 2009

Mozilla has a lot of Makefiles, and although we’re not completely happy with our current recursive-make build system, converting to something else is a very difficult task. In order to consider other solutions, we have to have a seamless migration path from our existing infrastructure. Unfortunately for any potential migration, Mozilla uses pretty much every make feature and quirk imaginable. It isn’t sufficient to get by with some simplistic importer: any migration needs to understand the arcane details of makefile syntax and parsing rules.

So finally, after two years of considering whether it’s feasible and a good idea, I bit the bullet and wrote a make replacement in python. It was almost as mind-numbingly difficult as I thought it would be. But after two solid weeks, I have a tool which will mostly build Mozilla, and seems to be doing it correctly.

Why would you do such a thing?

Mark Pilgrim was right: only a crazy person says “I’m going to build my own build system”. So rather than creating a whole new build system with a new syntax, I wanted to start out replicating an existing build system exactly. Then we can make incremental changes if appropriate.

What kind of incremental changes?

First up are speed increases. Our Windows builds are currently very slow, in part due to the large number of processes that are spawned. There are several ways to attack this:

  • Each makefile command currently spawns an MSYS shell and then (usually) the actual binary. The MSYS shell is expensive and in most cases unnecessary. pymake should be able to skip the shell for simple command lines (see the sketch after this list).
  • Mozilla runs `nsinstall` a lot. Axel has already implemented nsinstall as a python module: our makefile rules should be able to run commands in python and skip the external process.
  • We can use recursive make without launching additional make processes by recursing within the same pymake process.
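As a hedged illustration of the first point (not pymake’s actual logic), a conservative check might treat a command as “simple” only when it contains no shell metacharacters:

import re

# Any of these characters means the command needs real shell
# interpretation; everything else can be exec'd directly.
_SHELL_METACHARS = re.compile(r"[;&|<>(){}$`*?#~'\"\\]")

def can_skip_shell(commandline):
    return _SHELL_METACHARS.search(commandline) is None

def command_argv(commandline):
    # Safe only when can_skip_shell() returned True: no quoting or
    # expansion remains, so whitespace splitting is sufficient.
    return commandline.split()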

Where can I see this monstrosity?

The pymake website. You can also just pull or browse the pymake sources directly from hg.mozilla.org using Mercurial.

Why don’t you just hack GNU make itself?

There are some improvements we’re interested in doing which just aren’t possible within a C codebase:

  • implementing rules directly in Python
  • condensing multiple invocations of make into a single process

Python is also about a zillion times easier to hack quickly.

pymake is mostly GNU-make compatible. What are the differences?

The known differences are documented in the README file.

Does it perform well?

Not yet. pymake currently takes more time than GNU make. Almost all of this time is spent reading and parsing makefiles. Parsing autoconf.mk/config.mk/rules.mk from the Mozilla tree takes 0.3 seconds on my Linux machine, where GNU make takes 0.03 seconds. Dependency resolution and execution is as fast or faster than GNU make. I’d love to have some help profiling and fixing the parsing. Also, pymake does not yet support parallel execution.

Next week sometime, I promise to write about some of the difficulties I encountered writing pymake, and how test-driven development saved me from terrible pain and suffering.

There is No Invalid HTML

Thursday, January 15th, 2009

There are many different kinds of user agents for HTML markup. The most popular and most important user agents today are display-based web browsers. There are other types of HTML consumers, such as search engines, aggregators, text-based browsers, and speech browsers. There are also many different types of producers of HTML.

One of the most important features of the HTML5 specification, from the perspective of user agents, is that it specifies how to parse and consume all markup, not just correct markup. The vast majority of markup on the web is not “valid”. This will undoubtedly continue to be true, and it’s not a bad thing: imagine a dystopian world where only complex tools or skilled technicians could create web content!

The HTML5 parsing specification contains rules to transform any possible sequence of characters or bytes into a standard document object model. From conversations with Ian, I believe this was one of his primary goals for the initial HTML5 specification. I’m a little surprised that this is not called out more clearly in the parsing section of the specification.

Setting aside the unanswerable questions of whether generic metadata can be used to solve problems at web scale, or whether RDFa can solve the metadata problem, most of the discussion on the WhatWG mailing list about whether RDF (RDFa) should be integrated into the HTML specification has focused on whether RDFa would make the markup invalid HTML.

If everyone actually implements the HTML5 parsing specification, who cares whether it’s valid markup? You get the same document structure in every case. User agents which are aware of RDFa and wish to use it to solve problems may do so. This seems like the ultimate extensibility mechanism you could possibly want. The only “problem” is that a validator (conformance checker) will warn you that you’ve produced invalid markup.

Perhaps, rather than ranting about invalid markup, the specification should be altered. Remove all references to parse errors, and instead require parsers to interoperably transform any possible sequence of characters into the same document model. Then those who wish to use RDFa markup may do so, and user agents can ignore or process this metadata as appropriate.

One possible objection to this error-less regime is that it doesn’t give useful guidance to content authors or authoring tools. If everything is valid HTML markup, there still should be best practices for authoring HTML. If you want your content to work with existing browsers, reverse-engineering existing practice is a tricky exercise. Perhaps there should be a section of the HTML5 specification indicating how authoring tools should interoperably serialize a given document model. Then it would be relatively simple to write a fuzz tester to take a document, serialize it per the serialization spec, re-parse it per the parsing spec, and then compare the results for equality.
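A sketch of that round trip, assuming hypothetical parse() and serialize() functions implementing the two specs, and a document model that defines structural equality:

def roundtrip_stable(parse, serialize, markup):
    # parse() and serialize() are hypothetical implementations of the
    # parsing and serialization specs.
    dom = parse(markup)
    return parse(serialize(dom)) == dom

def fuzz(parse, serialize, random_markup, iterations=1000):
    # In an error-less regime, every possible input must round-trip.
    failures = []
    for _ in xrange(iterations):
        markup = random_markup()
        if not roundtrip_stable(parse, serialize, markup):
            failures.append(markup)
    return failures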

The discussions about HTML and RDF have drifted from practical interoperability into theoretical “validity”, purity, and architectural grandiosity. Let’s get back to the specific technical question that needs to be answered: if you embed RDFa in HTML5, does it parse into a usable DOM? If not, are there specific changes to the parsing specification that will allow it to parse to a usable DOM?

Seven Things You May Not Know About Me

Monday, January 12th, 2009

I got tagged by Dave Townsend. I like this meme: I think it’s important for people in the online world to know about the other people they see and work with, so that we’re not so one-dimensional.

Rules

  1. Link back to your original tagger and list the rules in your post.
  2. Share seven facts about yourself.
  3. Tag some (seven?) people by leaving names and links to their blogs.
  4. Let them know they’ve been tagged.

Seven Things

  1. For five years I was a full-time organist and choir director at St. Patrick’s church in Washington, D.C. If you look through the archives of this blog you can see some old posts that are related to my former profession. I switched careers mainly for financial reasons, and eventually I hope to switch back.
  2. Other than a fourth-grade typing class, I have never had any formal education in computers or programming.
  3. I am the oldest of seven siblings. My youngest sister is 20 years younger than I.
  4. I met my wife at the Friday Night Dance at Glen Echo Park, Maryland.
  5. My wife and I have five children, with another on the way in April. I have posted a birth announcement for each on this blog:
    1. Ellie
    2. Claire
    3. Abigail
    4. Micah
    5. Madeleine
  6. I am a published composer, including a tune for a hymn that my parish commissioned to honor St. Thomas More. See Sacred Music Vol. 129 No. 1.
  7. I was home-schooled.

Tags

Mozilla Layout Classes: nsIFrame class hierarchy

Friday, January 9th, 2009

The Mozilla layout code uses frame objects to lay out the DOM on the screen. (These are entirely different from <frame> nodes in the DOM.) All Mozilla frames inherit from a single C++ abstract class, nsIFrame. As part of a project I’m working on to separate the frame classes from XPCOM, I used dehydra to generate a graph of all the frame types in Mozilla.

See the graph in SVG format.

What I really want for Christmas is a web-based interactive graph viewer for this type of content. I’ve seen a couple closed-source things in Java, but nothing really exciting or hackable.

Note: Viewing this graph in Safari won’t work, because the image is much larger than a single screen, and Safari doesn’t provide scrollbars for SVG that overflows… feel free to download it and view it in Inkscape, or get Firefox!

How to Auto-Start Synergy Correctly on MacOS 10.5 (Leopard)

Wednesday, January 7th, 2009

Synergy is a wonderful tool: it lets you run multiple computers on your desktop, but use a single keyboard/mouse. The mouse cursor moves between screens seamlessly. Configured correctly, you even have a unified clipboard to enable copy/paste between machines, even running different operating systems. I have two Linux desktops and a Mac laptop on my desk and Synergy is a life-saver.

Configuring synergy and getting it to auto-launch correctly is not especially simple: the synergy autostart manual lists three ways, none of which worked for me to get the clipboard working. But I found, hidden in the Synergy bug database, a configuration script which solves the problem perfectly on MacOS 10.5 (Leopard):

  1. In a terminal, become root: sudo su -
  2. emacs /Library/LaunchAgents/org.gnu.Synergy.plist
  3. I have synergy at /builds/synergy-1.3.1. My mac client is named dupre. My Linux server is named gabrieli. Adjust these values accordingly:
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>org.gnu.Synergy</string>
      <key>ProgramArguments</key>
      <array>
        <string>/builds/synergy-1.3.1/synergyc</string>
        <string>-n</string>
        <string>dupre</string>
        <string>-1</string>
        <string>-f</string>
        <string>gabrieli</string>
      </array>
      <key>KeepAlive</key>
      <true />
      <key>LimitLoadToSessionType</key>
      <array>
        <string>LoginWindow</string>
        <string>Aqua</string>
      </array>
    </dict>
    </plist>

Log out and log back in to get everything started. This will launch a synergy client for the initial login desktop. At login this client will be killed, and a new client will be started for the user’s desktop. Note: LaunchAgents are new in MacOS 10.5. This technique won’t work in earlier versions.

How to disable the Comodo reseller root certificate in Firefox

Wednesday, December 24th, 2008

Slashdot is a-buzz (and rightly so!) with news that people have been able to obtain an SSL certificate for a domain they don’t own, by applying with one of Comodo’s certificate resellers. It is clear that there has been a major breach of trust, but we’re not sure of the best general solution. There has been a discussion in the mozilla.dev.tech.crypto newsgroup about what steps Mozilla should take for this breach.

In the meantime, I recommend disabling the root certificate used by this certificate authority, to avoid the possibility that other fraudulent certificates are floating around in the wild. Here’s how to disable the relevant CA root in Firefox:

  1. Open the preferences window
  2. Select the “Advanced” tab
  3. Select the “Encryption” sub-tab
  4. Choose “View Certificates”
    Firefox Preferences Window: Advanced -> Encryption -> Certificates

  5. Find and select the “AddTrust AB / AddTrust External CA Root” item
  6. Choose the “Edit” button
    Root Certificates Dialog

  7. Remove all trust setting check-boxes.
    Edit Certificate Dialog

Note: disabling this root certificate will cause SSL websites validated by this Comodo reseller to stop working. That’s why you’re doing it, but if it’s your favorite website that stops working, please don’t blame me! If you’re really paranoid, you could also disable all Comodo roots: these include all the certificates with names like “AddTrust”, “Comodo CA Limited”, and “The UserTrust Network”.

Thanks to Eddy Nigg for first providing these instructions.