Saturday, August 12, 2006

ROA and Microformats

The most recent feedback I've been getting on my ruminations regarding the Resource Oriented Architecture has mostly been concerned with the programmability of the web. In its vanilla state, the web is very easy to program. Basically, all a computer program needs to know how to do is identify the resource and then send it one of the four rigid, predefined messages that apply no matter what (sketched in code right after the list). These messages are:
  1. Add this resource
  2. Give me the representation of this resource
  3. Make the resource state transition
  4. Destroy the resource
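
To make this concrete, here is a rough sketch of those four messages in Python, using nothing but the standard http.client module. The host, the paths and the payloads are made up for illustration, and the POST/GET/PUT/DELETE mapping is simply the conventional HTTP one:

    import http.client

    # Hypothetical host and resource paths, used purely for illustration.
    conn = http.client.HTTPConnection("example.org")

    def send(method, path, body=None):
        """Send one of the four uniform messages and return status and body."""
        headers = {"Content-Type": "application/json"} if body else {}
        conn.request(method, path, body=body, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read()

    # 1. Add this resource
    print(send("POST", "/tennis-courts/", '{"name": "Court 7"}'))

    # 2. Give me the representation of this resource
    print(send("GET", "/tennis-courts/7"))

    # 3. Make the resource state transition
    print(send("PUT", "/tennis-courts/7", '{"booked": true}'))

    # 4. Destroy the resource
    print(send("DELETE", "/tennis-courts/7"))

    conn.close()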

Works like a charm every time. The beauty of this model is that it is unbreakable. Adhering to this model, one will never be forced to go through the pain of 'upgrading the web'. One's code will keep working no matter what.

Is Simplicity the Problem?

Talk to most web developers and you'd get the impression that this beautiful simplicity is more of a problem than a solution. Basically, it all boils down to programmers complaining that this protocol (we're talking HTTP here) is too plain. It doesn't give them the 'power' they're used to when working with the Java API, or the .NET API, and so on.

For example, in Java we have an open-ended world of unlimited custom-made, home-grown protocols. Anyone in the world is free to create their own monster mash, invent their own capabilities, and name them however they feel like. This is what programmers call 'power'.

But that's what I call 'weakness'. Why? Simply because it's so goddamn confusing. How can a thing that's so confusing be considered powerful?

The Problem of Discoverability

Some developers do recognize this problem (i.e. the problem with an open-ended, unlimited world of home-grown capabilities). Yes, it may be wonderful to have this vast world of incredibly sophisticated capabilities, but what's the point if no one knows about them? It would be absolutely unrealistic to expect a central authority to maintain a world-wide inventory of all the ever-growing capabilities that are being added to the web daily.

So instead of abandoning the wild goose chase, these architects suggest we use methods of piecemeal discovery. Various techniques have been proposed to that end: reflection, introspection, the Web Services Description Language (WSDL), god knows what else. None of these really work, because even after you've discovered that there is a capability out there you had no idea existed, there isn't anything you can do to use it. This is because, while you may be able to discover the remote procedure call signature of that capability (i.e. how to call it, what types of parameters it expects, and what type of value it returns), you still have absolutely no way of deciphering the meaning of that capability. What does it really mean, what does it really do?

You could always assume, but there is inevitably a big ass in every assumption.

It is very hard to interpret the intentions that some content conveys by relying on formally measurable parameters. That would be akin to trying to figure out whether a person likes something or not by measuring that person's pulse, blood pressure, blood sugar level, brain wave activity, etc. Sure, all these things are measurable, but are they really conducive to reaching an unambiguous conclusion?

Work from the Known, not from the Assumed

All the RPC methodologies prefer to work from the assumed standpoint. In other words, the RPC client prefers to engage the server in a preliminary conversation. The conversation goes something like this:

Client: "Hi, I am about to request that you render a service for me. Could you please tell me what you're capable of?"

Server: "Hi there, I offer wide variety of top-notch services for your exquisite enjoyment. What would be your pleasure today?"

Client: "Oh, I was hoping that you could help me convert inches to centimeters. Can you do that?"

Server: "Here is the list of things I can do (offers a long list of convoluted names)."

Client: "OK, let's see... (tries to find the name that would resemble the inches-to-centimeters conversion)"

Once the client makes a decision, the real conversation commences, meaning the real data may be exchanged.
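
To get a feel for what that preliminary chit-chat looks like in code, here is a rough Python sketch against a hypothetical XML-RPC endpoint, using the standard xmlrpc.client module. The URL and the conversion method name are pure inventions, and the system.* introspection calls exist only on servers that bother to expose them:

    import xmlrpc.client

    # Hypothetical RPC endpoint, invented purely for illustration.
    server = xmlrpc.client.ServerProxy("http://example.org/rpc")

    # "Could you please tell me what you're capable of?"
    methods = server.system.listMethods()

    # "Here is the list of things I can do..."
    for name in methods:
        print(name, server.system.methodHelp(name))

    # Only after guessing which convoluted name means "inches to
    # centimeters" can the real conversation commence.
    if "convertInchesToCm" in methods:
        print(server.convertInchesToCm(10))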

In contrast, a resource-oriented client does not engage the resource in any sort of preliminary chit-chat. The client simply identifies the resource and asks it to send its representation. The client examines the received representation and decides to either give it a miss, ask the resource to make a state transition, or ask it to destroy itself (or perhaps ask it to add a new resource). Simple as that. The conversation between the client and the resource commences right out of the gate. There's no pussyfootin'.
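
Here is the same idea as a rough Python sketch, again with only the standard library and a made-up resource identifier. Notice that there is no negotiation step anywhere in sight; the first thing on the wire is a request for the representation itself:

    import urllib.request

    # Hypothetical resource identifier, used purely for illustration.
    url = "http://example.org/tennis-courts/7"

    # Ask the resource for its representation, right out of the gate.
    with urllib.request.urlopen(url) as resp:
        representation = resp.read().decode("utf-8")

    # Examine the representation, then either give it a miss, ask for a
    # state transition, destroy the resource, or add a new one. A state
    # transition, for example:
    req = urllib.request.Request(url, data=b'{"booked": true}', method="PUT",
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)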

The Problem of Enriching the Protocol

So discoverability didn't really get us anywhere, nor could it ever do so. People are slowly but surely beginning to reach the conclusion that it is much safer and ultimately much better to ask the resource for its representation than to interrogate it about its dubious capabilities. At least by sticking to the representation model, we know that our request will always get serviced in a predictable way.

But the problem now seems to be that the representation of the resource is not structured enough. What does that mean? Let's go back to my tennis court example for a minute. If we identify a certain tennis court in our town and request its representation, the response will travel to our client and be rendered for our consumption. We will then be able to read about it in more detail. For instance, we may be able to see that this tennis court is not booked on Saturday morning, which is exactly the information we've been looking for (i.e. we've been searching for a tennis court in our town that would be free this coming Saturday morning).

So right there we see that this resource (i.e. tennis court) is endowed with the capability to be in a booked or free state. And that's all we need to know in order to fulfill our goal (and thus we'll find the web site that's hosting this resource to be very useful to us).

Now, most programmers see this situation as being very problematic. Basically, they are complaining that this representation of the resource is only human-friendly, and that machines have been left out of the equation. The highly unstructured content of the resource's representation may be fine for humans, but it is all but useless for machines.
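
To see what they are complaining about, consider this sketch. The markup below is an invented representation of the tennis court; the only thing a program can do with it is hunt for substrings, and that hunt falls apart the moment the author rephrases the sentence:

    # An invented, human-friendly representation of the tennis court.
    representation = """
    <html><body>
      <h1>Elm Street Tennis Court</h1>
      <p>Good news! The court is not booked on Saturday morning,
         so come on down.</p>
    </body></html>
    """

    # The best a machine can do here is look for a magic phrase, which
    # stops working after any innocent rewording of the page.
    is_free = "not booked on Saturday morning" in representation
    print(is_free)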

Because of that, they propose that the rock-solid HTTP protocol be enriched, ameliorated, and opened up to allow us to enforce more structure upon the content of the resource's representation.

How do they propose to do that? Microformats are one approach that seems to be getting many people's hopes quite high. So let's look at how Microformats propose to enrich the HTTP protocol.

The 80/20 Myth

Microformats offer a very non-intrusive approach to ameliorating the protocol. That approach is based on a more 'organic' view of things. In other words, it's a bazaar rather than a cathedral, a garden rather than a crystal palace.

The so-called Zen of Microformats states that it only makes sense to cope with the 80% of the problem space, and leave the remaining 20% of the unsolved portion to take care of itself.

This, of course, is very reasonable. It is rather unacceptable from the engineering standpoint, but we all know by now that software development is about as close to engineering as tap dancing is to Dave Chappelle's Block Party.

In a nutshell, then, Microformats propose to open up the playing field for structuring the wild and woolly content that is being served on the web as we speak.

Right now, it is possible to see some of the Microformats in action. There are plenty of good ideas that definitely add value to the meaning and structure of the resource representation.
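
For a taste, here is a rough, hCalendar-flavoured sketch of that same free tennis court slot. The markup is invented for illustration, but the general idea, piggybacking agreed-upon class names onto ordinary HTML, is what microformats are about, and a few lines of Python suffice to pull the structured data back out:

    from html.parser import HTMLParser

    # The tennis court again, this time with hCalendar-style class names
    # layered onto the markup. The markup itself is invented; only the
    # reuse of the class attribute to carry agreed-upon semantics matters.
    representation = """
    <div class="vevent">
      <span class="summary">Free slot</span> at
      <span class="location">Elm Street Tennis Court</span>:
      <abbr class="dtstart" title="2006-08-19T09:00">Saturday 9am</abbr>
      to <abbr class="dtend" title="2006-08-19T12:00">noon</abbr>.
    </div>
    """

    class MicroformatScraper(HTMLParser):
        """Collect the values of elements carrying microformat class names."""
        WANTED = {"summary", "location", "dtstart", "dtend"}

        def __init__(self):
            super().__init__()
            self.current = None
            self.data = {}

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            hit = set((attrs.get("class") or "").split()) & self.WANTED
            if hit:
                self.current = hit.pop()
                # The abbr pattern keeps the machine-readable value in @title.
                if "title" in attrs:
                    self.data[self.current] = attrs["title"]
                    self.current = None

        def handle_data(self, data):
            if self.current:
                self.data[self.current] = data.strip()
                self.current = None

    scraper = MicroformatScraper()
    scraper.feed(representation)
    print(scraper.data)
    # e.g. {'summary': 'Free slot', 'location': 'Elm Street Tennis Court',
    #       'dtstart': '2006-08-19T09:00', 'dtend': '2006-08-19T12:00'}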

So where's the problem? It's in the unsubstantiated belief that this additional structuring of the resource representation will catch on in approximately 80% of the cases. My hunch is that this expectation is hugely blown out of proportion.

The Selfish Web

One of the fascinating qualities of the web is that it offers one of the most altruistic experiences that emerge out of the most selfish motives. This is called 'harvesting the collective intelligence'. Each individual on the web pursues his/her proprietary, selfish goals, and yet the community tends to benefit vastly from such selfish pursuits.

But it would behoove us to keep in mind that, on the web, work avoidance is the norm. People mostly blurt things out on the web and then go on their merry ways. No one has the time nor any intention to stop and carefully structure their content.

Presently, the content offered on the web is at best structured to offer the following semantics:
  • HTML head with a half-meaningful title (hopefully)
  • body with (hopefully) only one H1 (heading one) markup tag
  • ordered/unordered lists enumerating some collection
  • divisions with semi-meaningful class names and ids
If one is extremely lucky, one may find an HTML representation of a resource that offers such well-formedness. But in most cases, the representations we do get fall below even these extremely lax standards.

How are we then to expect that Microformats will catch on and reach 80% of all representations? I think it's a pipe dream. I am doubtful that Microformats will ever reach even 20% of the representations out on the web. I hate to say this, but I'm afraid that we're more realistically looking at a 2% to 5% rate of adoption.

Only time will tell, as always.
