RDF has been touted as the data model to model all others; the way to represent all metadata on the web. For those of us who are “architects” at heart, this is an extremely attractive proposition. The problem is that it is destined to fail, for technical and human reasons.
Let us examine the technical issues first. What are the chief advantages claimed by RDF?
- Extensibility: RDF graphs can represent any data concept if there is an appropriate schema, and anyone can create a schema without conflicting with other schemas
- Aggregation: RDF can merge information from multiple sources, to combine and enhance knowledge
The technical problem is that you cannot achieve both of these goals at the same time. Any RDF aggregator must understand the data schemas being used, or the aggregation is worse than useless. For example, imagine two RDF graphs, both describing the same container, r:foo, as a Bag:
First graph:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj2
r:foo rdf:_3 r:obj3
r:foo rdf:type rdf:Bag
Second graph:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj4
r:foo rdf:_3 r:obj5
r:foo rdf:type rdf:Bag
Because a Bag is unordered, the logical aggregation of these graphs keeps the shared r:obj1 only once and renumbers the remaining members:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj2
r:foo rdf:_3 r:obj3
r:foo rdf:_4 r:obj4
r:foo rdf:_5 r:obj5
r:foo rdf:type rdf:Bag
However, let us imagine the same graphs, except that r:foo is now a Sequence instead of a Bag:
First graph:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj2
r:foo rdf:_3 r:obj3
r:foo rdf:type rdf:Seq
Second graph:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj4
r:foo rdf:_3 r:obj5
r:foo rdf:type rdf:Seq
The logical aggregation of these graphs keeps r:obj1 in the result twice, because a Sequence is order-sensitive and every position carries meaning:
r:foo rdf:_1 r:obj1
r:foo rdf:_2 r:obj2
r:foo rdf:_3 r:obj3
r:foo rdf:_4 r:obj1
r:foo rdf:_5 r:obj4
r:foo rdf:_6 r:obj5
r:foo rdf:type rdf:Seq
There are, of course, far more complicated examples, and there is frequently room for multiple interpretations of how to aggregate the same data. The basic point is that given a set of graphs, an automated tool cannot make an intelligent aggregation of them without understanding the schemas involved. This means that RDF is really either extensible or aggregatable, but not both.
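The Bag and Sequence cases above can be sketched in a few lines of Python. This is only an illustration, not any real RDF library: triples are plain tuples of strings, and the helper names (make_graph, naive_union, aggregate) are hypothetical. The Bag rule shown here, dropping members already present in the first graph, is the interpretation used in this article, and other readings are possible.

```python
import re

RDF_TYPE = "rdf:type"
MEMBER = re.compile(r"^rdf:_(\d+)$")  # matches rdf:_1, rdf:_2, ...


def make_graph(subject, kind, objects):
    """Build a list of (subject, predicate, object) triples for a container."""
    triples = [(subject, f"rdf:_{i}", o) for i, o in enumerate(objects, start=1)]
    triples.append((subject, RDF_TYPE, kind))
    return triples


def members(graph, subject):
    """Return the container's members ordered by their rdf:_n index."""
    numbered = []
    for s, p, o in graph:
        m = MEMBER.match(p)
        if s == subject and m:
            numbered.append((int(m.group(1)), o))
    return [o for _, o in sorted(numbered)]


def container_type(graph, subject):
    """Return the object of the subject's rdf:type triple, or None."""
    for s, p, o in graph:
        if s == subject and p == RDF_TYPE:
            return o
    return None


def naive_union(g1, g2):
    """Schema-blind merge: a plain set union of the triples."""
    return set(g1) | set(g2)


def aggregate(g1, g2, subject):
    """Schema-aware merge: the rule depends on the container type."""
    kind = container_type(g1, subject)
    if kind != container_type(g2, subject):
        raise ValueError("container types differ")
    a, b = members(g1, subject), members(g2, subject)
    if kind == "rdf:Bag":
        # Unordered: drop members of the second graph already in the first.
        merged = a + [o for o in b if o not in a]
    elif kind == "rdf:Seq":
        # Ordered: every position matters, so concatenate as-is.
        merged = a + b
    else:
        # An unknown schema leaves the tool with no rule to apply.
        raise ValueError(f"no aggregation rule for {kind!r}")
    return make_graph(subject, kind, merged)


FIRST = ["r:obj1", "r:obj2", "r:obj3"]
SECOND = ["r:obj1", "r:obj4", "r:obj5"]

if __name__ == "__main__":
    for kind in ("rdf:Bag", "rdf:Seq"):
        g1 = make_graph("r:foo", kind, FIRST)
        g2 = make_graph("r:foo", kind, SECOND)
        print(kind, members(aggregate(g1, g2, "r:foo"), "r:foo"))
```

Note that naive_union, which knows nothing about containers, leaves r:foo with two different values for rdf:_2 and for rdf:_3, and the final branch of aggregate has nothing sensible to do for a container type it has never seen. That is the extensibility-versus-aggregation conflict in miniature.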
The second issue, and a much more important one, is the human factor involved in metadata. It is a basic fact of life that what is not seen is not updated. Web authoring tools like Nvu know this, and so they prompt you to enter the <title> of a page when saving it. This makes the invisible visible. RDF has no visual representation, by design. It is intended to be processed by automated tools (of unknown type) for an infinite variety of purposes. This means that humans never see the metadata, and will therefore never maintain it.
The obvious solution to this problem is to make the metadata visible. So what we really need, say the RDF aficionados, is a data browser. This browser will allow you to “surf” the metadata of the web, just like the web browser of yesteryear allows you to surf HTML. This is a fine idea, except that apparently nobody has tried to integrate this “metadata browser” with an actual browser. Unless you do that, you are doomed to niche or non-existent adoption, and all the fancy theories in the world are useless.
So what’s the solution? My solution is simple: extend the HTML <link> and <a> tags with additional profiles (a la XFN), so that metadata is attached to visible entities if possible. Define a standard so that RDF can be embedded in the HTML head element, and brainstorm some extensible rendering framework so that browsers can actually display RDF metadata.
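To make the idea of metadata attached to visible entities a little more concrete, here is a rough Python sketch that reads the rel values of <a> and <link> tags and turns them into simple (page, relation, target) statements. It uses only the standard library's html.parser. The function name extract_relations, the sample page, and the example.org URLs are made up for illustration, and no particular profile or RDF embedding standard is assumed.

```python
from html.parser import HTMLParser


class RelExtractor(HTMLParser):
    """Collect (page, rel value, href) statements from <a> and <link> tags."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        href, rel = attr.get("href"), attr.get("rel")
        if not href or not rel:
            return  # nothing to say about links without both attributes
        # rel holds space-separated values, e.g. rel="friend met"
        for value in rel.split():
            self.triples.append((self.page_url, value, href))


def extract_relations(html, page_url):
    """Hypothetical helper: parse the markup and return the statements found."""
    parser = RelExtractor(page_url)
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical sample page; the rel values follow the XFN style.
SAMPLE = """
<html><head><title>Home</title>
<link rel="author" href="http://example.org/about">
</head><body>
<a rel="friend met" href="http://example.org/alice">Alice</a>
<a href="http://example.org/plain">no relation stated</a>
</body></html>
"""

if __name__ == "__main__":
    for triple in extract_relations(SAMPLE, "http://example.org/"):
        print(triple)
```

The point of the design is that every statement comes from a link the reader can see and click, so an author who edits the link is also editing the metadata.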
This was, I believe, one of the original inspirations of the RDF/Aurora code in Netscape 6, which was quickly swallowed by extreme and bizarre demands that RDF define the UI of the Netscape browser. Since we are now agreed that RDF was a mistake as a general-purpose data-binding language, and we are revamping the XUL templates to work with simpler XML/JS/SQL data, we can actually work on a real intertwingular metadata browser, one that is tightly integrated with, and extends, the basic functions of the HTML browser that have been so universally successful.