What’s wrong with RDF?

Yesterday I stumbled again on Sam Ruby’s post, Quantifying the “RDF Tax”. Sam is a very practical guy, and I will grant him that the Atom XML format is short and sweet. But I think Sam, Dare and others are focusing too much on how much tax they pay every month and not enough on how much they could get back at the end of the year (to use a poor analogy). Even worse, they don’t know what kinds of incentives the government might offer in the future if we started using more RDF-friendly formats. This is what Dare had to say in one of his comments:

Dare: I’m still waiting for anyone to give a good reason for Atom being RDF compatible besides buzzword compliance. Since you seem interested in making this happen can you point out the concrete benefits of doing this that don’t contain the phrase “Semantic Web”?

This kept resonating in my head throughout the day, especially as I see more XML-based formats (SPARQL Query Results, for example) being defined every day. I cannot help but wonder how in the world we will consume all of this data in more general applications.

At work, we were writing a demo RSS reader to show off some of our Eclipse-based tooling for RDF, and a colleague asked me whether he had to make up his own schema/namespace for storing external feeds. I hesitated at first, but since we are not inference-enabled yet and this is only a demo, I concurred. This sounds crazy, I know, but I have found it to be true of almost every feed reader and parser that I have looked at in depth: they make up their own store to aggregate the different feed formats. Now of course, this is only when dealing with feeds.
But as Jeff mentions, Web 2.0 is a much bigger issue when it comes to data formats:

Jeff Jarvis: But Web 2.0 adds on the wonders of the latter: feeds (RSS, Atom, FeedBurner, et al); lists (OPML, etc.); conversations (blog posts, Technorati links, PubSub feeds, comments); swarming points (tags on Flickr, Del.icio.us, Technorati, Dinnerbuzz); heat sensors (Blogpulse et al); aggregations (e.g., Command-Post.org); communities (Craig’s List, et al); alerts (Craig’s List feeds); decentralized distribution (bittorrent, etc.); and on and on.

I just don’t think that the Atom folks intended Atom to express all of the semantics needed for the wide variety of concepts being captured by this new generation of Web 2.0 apps, and the same goes for the Atom extensions. Atom will undoubtedly be the format and API of choice for all these content types, but it was designed to carry the minimal amount of metadata needed to communicate information, not to be a rich semantic framework that expresses it all.

Also, please don’t get me started on the Corante folks, who are more than obsessed with tags and have tried every variation of tagging to aggregate information, with tags such as: “hey-technorati-read-this-post-it’s-about-my-daughter-eating-ice-cream-but-delicious-you-shouldn’t-read-it-::family::-that-last-tag-is-for-flickr-though”. Please, folks, tags are just the tip of the iceberg. We need to start exploring richer metadata models, and RDF is a good start. Come on, everyone, can’t we just get along?

Finally, I’ll leave you with a quote from Danny that captures how I see myself building next-generation Web 2.0 applications that don’t have to hard-code formats:

Danny Ayers: I know how I’m going to support these multiple extensions in my own code. I’ll have the RDF model internally and map to it (XSLT for now), taking advantage of subclass/subproperty inference for dealing with the semantic differences between the different formats (as Suzan suggests). SPARQL will allow me to query across the different properties out of the box. Best of luck to everyone else.
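
To make the subproperty idea in Danny’s quote a little more concrete, here is a rough sketch in plain Python. It is not a real RDF store or SPARQL engine, and it is not Danny’s code. The property names (rss:title, atom:title, common:title and so on) and the entry identifiers are made up for illustration, and the sketch assumes each property has at most one parent. The point is only that once format-specific properties are declared subproperties of a shared one, a single query on the shared property finds data that arrived in either format.

```python
# Hypothetical subproperty declarations: format-specific property -> shared property.
# These names are illustrative only; they are not real schema URIs.
SUB_PROPERTY_OF = {
    "rss:title": "common:title",
    "atom:title": "common:title",
    "rss:link": "common:link",
    "atom:link": "common:link",
}


def super_properties(prop):
    """Return prop followed by every property it is transitively a subproperty of."""
    chain = [prop]
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        if prop in chain:  # guard against accidental cycles in the declarations
            break
        chain.append(prop)
    return chain


def infer(triples):
    """Add, for each (subject, property, object), the same statement on every super-property."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        for parent in super_properties(prop):
            inferred.add((subject, parent, obj))
    return inferred


def query(triples, prop):
    """Return sorted (subject, object) pairs for one property."""
    return sorted((s, o) for s, p, o in triples if p == prop)


# Two entries that arrived in different feed formats (made-up data).
feed_triples = {
    ("entry:1", "rss:title", "What's wrong with RDF?"),
    ("entry:2", "atom:title", "Quantifying the RDF Tax"),
}

store = infer(feed_triples)
titles = query(store, "common:title")
print(titles)
```

The reader code only ever asks for common:title. Supporting another feed format would mean adding entries to the mapping, not touching the query.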
