
Friday, June 20, 2008

Software Development Detox Part 5: Session

In my previous installment (oh-my-god, it's already been a year!), I discussed the hard-wired need to retain a central point of control when building software. Today, I'm going to examine the central software concept that implements this urge for control: the session.

A session is a vessel for retaining the memory of all the events that have occurred during a software interaction. It is usually implemented with the intention of providing continuity across that interaction.

Because of its central position in the overall architecture of networked software, the session is a very brittle concept. It can quickly accumulate entropy and end up costing a lot in terms of computing infrastructure.

On the web, however, since the prevailing architecture is stateless, the concept of a session is entirely unnecessary. The memory of an interaction that occurs on the web need not be retained by any participant.

Coming to terms with this simple dictum is proving extremely difficult for many software developers. They feel that, without the crutches and training wheels that the session provides, they will have no way of deciding what to do next. At least when they use the state of the conversation, as recorded in the session, they can write some logic that helps them execute meaningful actions. Without that knowledge available to them, they feel lost.

This sentiment betrays a lack of understanding of the web architecture. On the web, each request contains all the information the server code needs to decide what to do next. There is absolutely no need for the server to keep track of any previous requests. Not only is such bookkeeping exorbitantly expensive, it also creates a single point of failure, which is a primary cause of faulty software being deployed live worldwide.
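
To make the contrast concrete, here is a minimal Python sketch (the handler names and fields are hypothetical, not taken from any particular framework): the session-bound handler has to consult a server-side memory to interpret a request, while the stateless handler acts purely on what the request itself carries.

    # Session-bound style: the server must remember earlier requests.
    sessions = {}  # server-side memory of the conversation, a single point of failure

    def handle_with_session(session_id, action):
        cart = sessions.setdefault(session_id, [])  # look up what happened before
        if action.startswith("add:"):
            cart.append(action[len("add:"):])
        return {"items": cart}

    # Stateless style: every request carries everything needed to act on it.
    def handle_stateless(request):
        items = list(request.get("items", []))      # state travels with the request
        if "add" in request:
            items.append(request["add"])
        return {"items": items}                     # full state handed back to the client

    print(handle_with_session("abc123", "add:book"))
    print(handle_stateless({"items": ["book"], "add": "pen"}))

In the stateless version, any server in a farm can answer any request, because nothing needs to be remembered between calls.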

So the advice to all budding software developers is: abandon the idea of a session, and rely on your own wits when building the business logic that drives your site. Going through such a software development detox procedure will ensure that you build and deliver robust, lightweight web sites that are easy to host and maintain.

Sunday, January 20, 2008

Why is Software Development on the Web so messed up?

It's been said that those who do not learn history are doomed to repeat it. In the spirit of absolutely not wanting to relive the excruciating pain of software development that was the norm in the '90s and the first 5 - 6 years of the 2000s, I am going to dissect the following historical document: XML, Java, and the Future of the Web, published on October 2, 1997, by Jon Bosak.

As I walk through some of the salient points presented in that portentous document, I will try to bring to your attention certain glaring errors in understanding and judgment regarding the goal of computing, and of information processing on the web in particular.

Let us then begin with the opening sentence:

"The extraordinary growth of the World Wide Web has been fueled by the ability it gives authors to easily and cheaply distribute electronic documents to an international audience."

Right out of the gate, Mr. Bosak takes a left turn and commits himself to the implementational aspect of the technology he is discussing. The key concept, of course, is 'electronic documents'. In his view, the web is useful because it offers an affordable distribution channel for prefabricated electronic documents.

The problem with the Ivory Tower view of the web is that, being high-tech bent, the high priesthood inhabiting the Ivory Tower tends to view the smallest unit of processable information as an object, such as an electronic document.

Following on the heels of the opening salvo, the second sentence in this paper claims:

"As Web documents have become larger and more complex, however, Web content providers have begun to experience the limitations of a medium that does not provide the extensibility, structure, and data checking needed for large-scale commercial publishing."

This is now quite confusing. If the smallest unit of information processing on the web is an electronic document, what need for extensibility, structure and data checking is he talking about?

Let's look at the third sentence; maybe we'll find some clarification in there?

"The ability of Java applets to embed powerful data manipulation capabilities in Web clients makes even clearer the limitations of current methods for the transmittal of document data."

Huh? Java applets? OK, this is completely obsolete, but it still betrays the unexplained bias toward powerful data manipulation capabilities in Web clients.

In the fourth sentence (and opening a new paragraph), the author introduces the concept of Extensible Markup Language, and explains:

"... an Extensible Markup Language (XML) for applications that require functionality beyond the current Hypertext Markup Language (HTML)."

What is not clear yet is what kind of "functionality beyond the current Hypertext Markup Language (HTML)" he is referring to.

HTML -- imaginary limitations

After delivering the opening salvo, as exposed above, the author goes on to elaborate on the severe constraints that HTML introduces and how debilitating these constraints appear to him. Here is the author's list:
  • Lack of Extensibility
  • Absence of Structure
  • Inability to perform Validations

Extensibility

This feature is the holy grail for developers who are keen on creating their own, proprietary protocols. The much celebrated power that Extensible Markup Language (XML) brings to the table -- namely, extensibility -- is at the same time the most troublesome aspect of the failed attempts to engage in collaborative computing. If anyone and everyone is free to use technology that enables them to daydream up any protocol they wish (in effect creating their own unique language), the Babylonian confusion created that way must quickly reach cosmic proportions. This is exactly what's happening right now on the web -- everyone feels entitled to dream up their own bloody protocol, and the potential online conversation tends to fall mostly on deaf ears.

The author states this clearly: "HTML does not allow users to specify their own tags or attributes in order to parameterize or otherwise semantically qualify their data." The key phrase is "their own" -- which amounts to "invent your own language".

Structure

Similar to the old adage 'form follows function', structure typically follows the meaning of the information. Mr. Bosak's lament in this respect is largely unfounded, since HTML does offer a fairly comprehensive structure for expressing the semantics of the represented information.

Whether the author was aware of that ability to express semantics using HTML remains a bit of a mystery. In any event, what the author seems to be complaining about with regard to the lack of structure in HTML is the erroneously perceived need for representing complex hierarchical dependencies.

There is really no evidence that the average consumer perusing web resources has such a pressing need for consuming large and increasingly complex hierarchical constructs. In the author's own words: "HTML does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies."

Yes, the above is true, but the question is: why would anyone have the need to consume 'deep structures, object-oriented hierarchies, or database schemas'? The question, of course, remains unanswered.

Validations

Let's examine Mr. Bosak's objection regarding HTML's inability to perform validations: "HTML does not support the kind of language specification that allows consuming applications to check data for structural validity on importation."

I must admit that I'm a bit confused -- why would anyone use a semantic markup language, such as HTML, to perform validations on 'importation'? Importation to where?

Responsibilities of the Web Client

The author then lists some imaginary responsibilities that are, in his view, mandatory for the web client:
  1. Web client may have to mediate between two or more heterogeneous databases
  2. There is a need to distribute a significant proportion of the processing load from the Web server to the Web client
  3. Web client often has to present different views of the same data to different users
  4. Intelligent Web agents will attempt to tailor information discovery to the needs of individual users
The above list betrays a strongly desktop-centric approach to software development. All of the 'needs' listed in the four items above fall naturally into the server-side jurisdiction. The client is only responsible for rendering the representation that the server-side business logic prepares.

Linking

Toward the end of his paper, Mr. Bosak addresses some inefficiencies that he perceives in the hypertext system as implemented using HTML. His list is, again, very revealing. Here is what he finds lacking in HTML's approach to linking:
  • HTML lacks location-independent naming
  • It lacks bidirectional links
  • It lacks the ability to define links that can be specified and managed outside of documents to which they apply
  • No N-ary hyperlinks (e.g., rings, multiple windows)
  • No aggregate links (multiple sources)
  • No transclusion (the link target document appears to be part of the link source document)
  • No attributes on links (link types)
Below is my response:

Location-independent naming: Huh? If I link to a resource representation, such as google.com, how is that location dependent? Where in that hypertext link do we see any information that would reveal the location of the resource representation named google.com?

Bidirectional links: This is the engineer's wet dream: having links that will never get broken. Not gonna happen; time to grow up and deal with reality.

Links that can be specified and managed outside of documents to which they apply: It's the damned documents again! Snap out of the document dogma and start living in the world of resources and their representations. In other words -- content!

N-ary hyperlinks (e.g., rings, multiple windows): Not sure what this means, but I really cannot see any real-life use for it.

Aggregate links (multiple sources): OK, 10 years ago apparently no one knew about Really Simple Syndication, feed aggregators, and the like.

Transclusion (the link target document appears to be part of the link source document): Also, no clue about the mashups that would become all the rage 10 years later...

Attributes on links (link types): What on earth does that mean? Link types? Like what?

Giving Java something to do!?!

Finally, the author cites the slogan "XML gives Java something to do" (no source is quoted in the original paper).

What does he mean by this battle cry? What did he envision Java would do with XML?

It's hard to say now that more than 10 years have passed since that battle cry was uttered, but my hunch is that he was referring to the 'nomadic' nature of Java. In the olden days, Sun Microsystems was selling the "network is the computer" slogan, and Java was Sun's trailblazing language at the time. According to Sun, Java lives on the network and is nomadic, courtesy of RMI (Remote Method Invocation). Snippets of Java code can travel the network, on demand, in order to supply just-in-time computations on the requesting client.

Supposing that the client had just received an XML document but didn't know how to present it to the user, a snippet of Java code could be invoked and downloaded on a whim. Once installed on the client, the Java routine would massage the XML, and so on.

This dated model of 'distributed computing' has long been pronounced dead and buried. The problem is, we still have armies of software developers who tend to think in a similar vein. Because of that, software development on the web has turned into a terrible chore, and the resulting web sites tend to be nightmarish.

The moral of this story is to learn the lesson of history, become aware of the blind alleys that were paved by developers a decade ago, and learn how to avoid the minefields, pitfalls, and traps of obsolete software development practices.

You're welcome!

Friday, December 21, 2007

Resource Representation and Discoverability

We now seem to be entering the phase of reclaiming the web. Who are we reclaiming it from? Why, from no one else but the hordes of software developers.

The web has been hijacked by software developers. It is only natural that the people who dedicate their lives to becoming as intimate with machines as is humanly possible get to be the ones who take center stage in building the web. But there is trouble in paradise. The web is, in its essence, the exact opposite of the machines. The web is all about humans.

Now, software developers are humans; however, they are humans who value serving machines more than they value serving other human beings. Because of that, the web is a horrific place right now, fully dedicated to serving the machines.

This terrifying state of affairs must change. And the only way it will change is if humans reclaim the web, pry it out of the greedy hands of the slaves-to-the-machine, and install it in its proper place -- as a platform for serving the needs of the human mind and the needs of the human social dimension.

Resources and representation

As has been argued elsewhere, resources on the web are mere abstractions. They are not tangibles that one could indulge in. At best, one can hope to consume the representations of the abstract resources scattered around the web.

It is often erroneously assumed that resources are directly exposed on the web, in their raw form. One example would be the unanimous conviction that URLs are the web's native resources. Thus, a URL such as craigslist.org is viewed as a web resource.

Nothing could be further from the truth. A URL (such as craigslist.org) is not a web resource; it is merely a representation of a web resource. There seems to be an abstraction labeled craigslist.org and hosted somewhere on the web. But what that abstraction (i.e. craigslist.org) really is, we have no way of knowing.

What we can learn about is one or more representations of that resource. For example, one representation of the craigslist.org resource is its URL. Another one could be the list of all the sites offered by craigslist.org. And so on.
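
If a code analogy helps, here is a minimal Python sketch (the Listing class and its fields are hypothetical): the resource object itself never leaves the server; what clients receive are representations derived from it, and there can be more than one of them.

    import json

    class Listing:
        """The resource: an abstraction that lives on the server and is never shipped as-is."""
        def __init__(self, title, price):
            self.title = title
            self.price = price

    def as_html(listing):
        # One representation of the resource
        return "<h1>%s</h1><p>$%d</p>" % (listing.title, listing.price)

    def as_json(listing):
        # Another representation of the very same resource
        return json.dumps({"title": listing.title, "price": listing.price})

    resource = Listing("Used bicycle", 80)
    print(as_html(resource))
    print(as_json(resource))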

What's the use of representation?

Some people fail to see the usefulness of the representation. They'd much rather get their hands on the resource itself, instead of beating around the bush of resource representation. The best way to explain this problem is to employ the assistance of some heavy-duty science:

The map is not the territory

This famous premise was issued by Alfred Korzybski, one of the seminal thinkers who helped shape communication and information theory.

A map is the representation of the territory. Without the map, we'd be lost when traveling through the territory.

In a similar fashion, resource representation could be viewed as a map that eases our voyage through the territory (i.e. the resource we're exploring).

Very few people tend to question the usefulness of maps. It is our hope that, similarly, people will learn to embrace the usefulness of the resource representation.

Discoverability

The web is intended for human consumption. Humans are notorious for having a fairly constrained short-term memory buffer, and are thus forced to consume information in a piecemeal fashion. The architecture of the web is therefore tailor-built to serve exactly that constraint -- consuming the resource representation at a fairly leisurely pace. Expecting humans to consume the resource representation in a single giant gulp would be utterly unrealistic, and so the fundamental architecture of the web is based on the principle of discoverability.

What that means is that, on the web, we are serving byte-sized chunks of resource representations. These byte-sized chunks are intended to be consumed by human users pretty much at a glance. Any need for consuming more, for learning more, gets fulfilled by letting the users explore further, allowing them to discover more intricate details of the resource representation.
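
A small sketch of what that looks like in practice, with hypothetical data and link paths: the representation stays glanceable and, instead of embedding everything, offers links that the consumer can follow to discover more.

    # A small, glanceable representation that points onward instead of
    # delivering the whole data set in one gulp (paths are hypothetical).
    tournament_summary = {
        "name": "City Volleyball Open",
        "status": "registration open",
        "links": {
            "teams": "/tournaments/42/teams",
            "schedule": "/tournaments/42/schedule",
            "results": "/tournaments/42/results",
        },
    }

    # The consumer explores further only when interested:
    next_step = tournament_summary["links"]["schedule"]
    print("Fetch %s to learn more" % next_step)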

Away from the sociopath web

Today's web is mostly built by humans who crave serving the machines. As such, the way resource representations are typically architected on the web today betrays an optimization for the way machines consume information. There is typically very little discoverability offered on most web sites, and the consumers of the representation are usually expected to digest insanely vast quantities of intricate information in a single gulp.

It is painfully obvious to even a very casual observer that the web today is not tailored for easy consumption by humans. It is as if some sociopaths, who lack any degree of empathy with fellow human beings, have built most of the web sites in operation today. Come to think of it, 'sociopath' is a fair characterization of an individual who values interacting with machines more than interacting with other human beings.

It is high time we start working on getting out of this sociopath hell.

Thursday, August 16, 2007

Why is ROA Hard to Understand?

Resource Oriented Architecture is the term I coined a year ago, only to discover almost immediately afterwards that others had been thinking independently along the same lines. During the past 12 months, I've witnessed numerous discussion threads and debates centered on the importance of ROA. Most of these discussion threads betray a complete misunderstanding of the Resource Oriented Architecture, which is very alarming, given that ROA is the native architecture of the web.

And the web is the native infrastructure for information dissemination, exchange, and sharing.

Old habits die hard

Most software developers in circulation today have cut their teeth by learning how to implement computing algorithms. Something akin to: "here is a jumbled sequence of numbers, write a computer program that will sort them in ascending order".

The emphasis in such an exercise is always on exhibiting a reasonable amount of direct control over the execution logic. Would-be programmers have been groomed for generations to master the skills of gaining full control over the stream of instructions that the CPU executes. In other words, to them, being a computer programmer/software developer implies that full control over all aspects of processing must be in place.

It's a single point of human control, just as the Central Processing Unit (CPU) is a single point of machine control.

But in the world of the web, such notions of centralized control are not only ridiculous, they are downright harmful. Still, armies of CPU-trained programmers have now been employed and deployed to develop software that will run on the web.

Is it any wonder, then, that such 'central point of control' minded individuals are creating thoroughly dysfunctional web sites? Indeed, old habits do die hard.

New habits are slow to emerge

What the situation described above means, then, is that the existing workforce dedicated to developing software programs is pretty much a write-off. Because old habits die hard, it will quite likely be much harder to unlearn the old habits and relearn new ones than to start from a clean slate.

However, given the vast demand for web-based software that is out there, we must work on creating some sort of compromise, in an attempt to bring the crusty old 'central point of control' software development workforce into the web fold. For that reason, we need to continue explaining the difference between the Central Processing Unit software architecture and the Resource Oriented Architecture.

Transitioning from doing to managing

One of the hardest things to do in life is to make a successful transition from doing something to managing that same thing. The temptation to roll up one's sleeves while managing the process is simply too strong to resist. And that's why not too many people end up being good management material.

In the world of software development, right now there is a huge demand for people who are capable of abstaining from rolling up their sleeves and instead delegating the responsibilities to the 'workforce' out there.

What does that mean? In more technical parlance, it has been observed time and again that once software developers get trained in making procedure calls when processing information, they tend to get addicted to that gateway drug. Just as everything looks like a nail to a person with a hammer, every information processing challenge looks like a string of procedure calls to a programmer who has learned how to make procedure calls. This algorithmic frame of mind is very hard to get rid of once contracted. And in order to become an efficient web developer, each and every procedure-besotted person must learn to let go of the procedure-call-driven world view.

Transitioning from local to distributed procedure calls

Most people learn how to program computers by implementing procedure calls that are local. What that means is that their world view is CPU-centric. In other words, the executable code in its entirety is sitting on a local box, usually right next to the developer's desk. The developer then intuitively grasps the world as consisting of a single Central Processing Unit which acts as a traffic cop, deciding who gets to do what and when. The 'who' in the previous sentence refers to the procedure. Typically, a programmer will modularize his code into a fairly large number of procedures. These procedures will then be called at opportune moments, will perform the intended processing, and will then recede into the background.

This is the prescribed way of teaching newcomers how to develop software. It willfully ignores real-world issues, such as the fact that there is very little use or need for a 'desert island', 'single sink' scenario, where a computer user sits alone engaged in a 'glass bead game' on the computer. In the world of business at least (and also in the world of entertainment), multiple users are a must. Meaning, CPUs and people will be networked (999 times out of 1,000).

There inevitably comes a time when a software developer wannabe realizes that a transition from a single CPU to multiple CPUs connected via the network is unavoidable. At that point, the newbie usually gets struck by the immense complexities that such a transition entails. And it is at that point that newbies all over the world decide to drag the procedure-call world view into the networked world.

Now the problem of control in a multi-CPU environment arises. Who's in charge here? It used to be that a single human mind was in full charge and control of the execution of computing instructions, via the single CPU. Now, all of a sudden, multiple CPUs distributed across vast networks enter the equation. The remote nodes start posing a problem when it comes to controlling the proceedings.

The only way for such developers to try to tame this complexity is to wiggle in their chairs and fabricate some hare-brained scheme for enabling remote procedure calls (or RPC). Various mind-boggling protocols get implemented (CORBA, RMI, DCOM, SOAP, SOA, and so on). All of these are incredibly arbitrary, non-compelling, frighteningly complex, and brittle.

As we can see, this transition from making local, simple-minded procedure calls to making distributed, remote procedure calls never goes well. In the end, it inevitably results in fragile, brittle software that requires constant duct-taping and chewing-gumming in order to stay afloat.
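
The contrast can be sketched roughly as follows (every endpoint and procedure name here is hypothetical): the RPC style asks every consumer to learn each vendor's private vocabulary, while the resource-oriented style reuses the same small set of verbs against named resources.

    # RPC style: each vendor invents its own verbs, which every client must learn.
    rpc_calls = [
        ("GetCustomerDetailsEx", {"custId": 17, "expandFlags": 3}),
        ("SubmitOrderV2",        {"custId": 17, "sku": "A-9"}),
    ]

    # Resource-oriented style: a fixed, shared set of verbs applied to named resources.
    resource_calls = [
        ("GET",  "/customers/17"),
        ("POST", "/customers/17/orders"),
    ]

    for verb, target in resource_calls:
        print(verb, target)

The brittleness described above lives in the first list: every new vendor, and every new version, adds vocabulary that consumers must track.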

Who's going down with the ship?

It is obvious to even a casual observer that the majority of software projects, today as in the past, fail miserably. As the complexities of the business scenarios continue to increase at a frightening rate, this rate of failure will only skyrocket.

At this pace, it won't be long before the software development industry turns into another Titanic, if the majority of developers continue to charge down the remote procedure call path. Developers who refuse to go down with the ship have only one option -- abandon the RPC ship and hop into the ROA boat.

But as we've already seen, the transition from navigating a huge, unwieldy ship to rowing a tiny, nimble ROA canoe is proving more challenging than most developers are prepared to bear. That's why a major shakeout must lie ahead of us, a shakeout that will separate the monolithic, totalitarian software practices from the nimble, distributed ones. And ROA practices are certainly on the opposite side from the monolithic remote procedure call practices.

How to make the transition?

It seems to me that, in the process of making the transition from RPC to ROA, the biggest stumbling block for the entrenched software developers lies in the fact that ROA proposes erecting an impenetrable barrier between the client and the resources on the server. In the bad old RPC world, such barriers are unheard of. In other words, if the client is privy to the protocol that the service-oriented server has arbitrarily and unilaterally made up, the client can intimately manipulate the state of the server.

No such thing is ever possible in the ROA world. And that's the biggest put-off for the honest-to-god procedural crowd. They simply don't seem physically capable of conceiving of a world where it would not be possible to 'crack the code', learn the protocol, and then rule the server on the other end.

So in order to make a successful transition to ROA, these software dictators must learn how to abdicate from their dictatorial throne. What are the chances of that happening en masse any time soon?

Slim to none? What do you think?

Wednesday, August 15, 2007

Resource Representation

Software developers seem largely incapable of understanding the difference between resources and their representation. Understanding that difference is absolutely vital in understanding REST and the Resource Oriented Architecture (ROA).

Is it any wonder, then, that we are beginning to see articles like this one ("Why is REST so hard to understand?") pop up in publications devoted to web architecture? It is absolutely clear that we won't be in a position to advance ROA practices unless we reach a critical mass of people who possess a solid grasp of the difference between a resource and its representations.

Right now, it seems that only an infinitesimally small percentage of software developers are aware of the significance of that difference. For example, if we say that there are ten million software developers worldwide, 9,999,000 of them are blissfully ignorant that such a thing as resources and their representations even exists. Which leaves us with only about 1,000 developers who truly understand what it all means. Not a very encouraging statistic. As a matter of fact, the situation today is downright dismal.

We need to do something quickly in order to remedy this catastrophic situation. My small contribution here is to attempt to clarify the difference between resources and their representation. I will use mostly concepts from everyday life for illustration purposes. I will then attempt to draw the parallel between these concepts and the concepts one encounters in the arena of software development.

What is a Resource?

Let's start from the top: a resource is anything that humans find useful. For example, a house might be a resource. Or money might be a resource. We find houses very useful because they offer a certain level of comfort in our daily living. We also find money very useful, because it's a resource that enables us to get other resources, such as houses.

Ten years ago, when I was buying my first house, I wanted to see what kind of a house I could afford. To that end I met with a mortgage broker, who interviewed me in an attempt to assess my buying power. Among other things, he needed to know how much money I could allocate toward the down payment on the house.

Once I told him how much money I had for the down payment, he was able to plug that information into his spreadsheet and tell me what kind of mortgage I qualified for. So I was then all set to go out and shop for a house, right?

Well, not so fast! Just because I'd blurted out a dollar figure that I had in mind for the down payment didn't automatically mean that I fully qualified for that mortgage. Because, you see, my word in these matters was viewed as being largely subjective, that is, prone to miscalculations and all kinds of other aberrations. What the bank needed was a more objective assessment of my buying power. Talk is cheap, and anyone can claim that they have one hundred thousand dollars tucked away in their bank account, ready to be plopped down toward the down payment, but the reality might be somewhat different.

Because of that, the banks have devised a more impartial practice, whereby the applicant must provide what is deemed to be a more objective proof that the money really is in the applicant's bank account.

In this example, the money that has been allocated for the down payment on the house is regarded as a resource.

What is a Representation?

Now, the down payment money we are talking about may indeed be sitting in my bank account. But how am I to truly convince the mortgage broker that it is really there? Well, I use the means of representation, that is to say, I use something else that acts on behalf of that money. For example, I use my words to convey the fact that I do have $100,000 in my bank account. My words then represent my money.

But, as we've seen, that representation (my words uttered in a casual conversation) is apparently not good enough for the mortgage broker. He needs something more 'solid' before he is fully convinced that the money is actually there. Meaning, he needs a different form of representation.

Typically, what the mortgage broker would deem a fairly impartial and objective representation of my money is a piece of paper issued by my bank that claims that it is true that I do have $100,000 in my account. In other words, once the mortgage broker receives that piece of paper, he can 'take it to the bank', so to speak.

That piece of paper is what we call a representation of a resource. The real money is nowhere to be found in that representation, yet in a somewhat superstitious manner, all the parties involved almost religiously obey and agree that a piece of paper is as good as the real thing.

Back on the Web

The exact same protocol applies on the web. There, as in everyday, non-virtual life, we have resources which are tucked away somewhere, and all we get to see, feel, and touch are the representations. Resources on the web are merely concepts that are being represented using other concepts. On the web, as in real life, no one ever gets to see, hear, feel, or touch a resource. All we get are mere representations.

We are free to attempt to manipulate those representations. And even in cases where our manipulation yields perceptible differences, we have no grounds to claim that we have actually manipulated the resource itself. All we can conclude is that the resource representation has changed under the influence of our actions, nothing more, nothing less.

For example, if there is a resource such as a volleyball tournament published on the web, I can get its representation by sending a request for it. I may then decide to destroy that tournament by manipulating the tournament's representation. And my action may result in an affirmative representation (i.e. the resource may respond by sending me its representation which states that it has destroyed itself). I may then conclude that the volleyball tournament has, indeed, been destroyed. But I really have no way of ascertaining that. All I can claim is that I have acted upon the initial representation of that volleyball tournament, I have sent it the request to destroy itself, and I have received the confirmation that it is now destroyed. However, the resource itself may still be alive and well, existing somewhere beyond the reach of the resource representation. The only thing that is manifestly certain is that from that moment on, the resource will refuse to render its representation for the incoming requests.
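
Expressed in HTTP terms, the exchange described above might look roughly like this (the URL is hypothetical); notice that everything the client ever handles is a representation:

    # The messages a client could exchange with the resource (URL hypothetical).
    get_request = (
        "GET /tournaments/42 HTTP/1.1\r\n"
        "Host: example.org\r\n"
        "Accept: text/html\r\n\r\n"
    )
    delete_request = (
        "DELETE /tournaments/42 HTTP/1.1\r\n"
        "Host: example.org\r\n\r\n"
    )
    # A plausible confirmation: just another representation, saying the resource is gone.
    confirmation = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n\r\n"
        "Tournament 42 has been deleted."
    )

    for message in (get_request, delete_request, confirmation):
        print(message)

From then on, a GET on the same URL would presumably yield 404 Not Found or 410 Gone -- again a representation, and still not the resource itself.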

This is similar to how, in the real world, no one would ever suggest that I take the mortgage broker into my bank's safety vault, point him to the pile of cash sitting there, and tell him: "Here, here's my $100,000, please count it and let's get on with it!" No one goes straight to the resource itself in real life, and the same holds true in digital life as it gets distributed on the web (not to mention that even physical cash is merely a representation of some other concept, and so on; one can never possibly get to the bottom of it, that is, the real resource is noumenal, unreachable by us mere mortals).

Understanding this distinction is vitally important if we are to move forward successfully in building the functional web of information. It is my experience that none of the developers I've ever worked with understand this situation, and that all of them erroneously believe that they are, at all times, working with the 'real thing'. Nothing could be farther from the truth, so I implore all developers to take a moment and try to digest the difference between resources and their representations.

Thursday, June 14, 2007

Software Development Detox Part 4: Tribal Computing

One of the absolutely most heinous types of software intoxication is what I like to call Tribal Computing. This is the form of computing that is tied to a primitive, proprietary setup and is a remnant of the old days of early adoption of computing by businesses.

Ontogeny recapitulates Phylogeny

Ernst Haeckel (a German natural historian) wrote in 1868: “Ontogeny, or the development of the individual, is a short and quick recapitulation of phylogeny, or the development of the tribe to which it belongs.” (in this context, ontogeny is the development of the fetus, and phylogeny is the evolution of a species). Haeckel was referring to the way the fetal development of mammals seems to parallel the evolution of all life on earth. The fertilized mammalian egg first resembles a single-celled amoeba, then a multi-celled sponge, then a jellyfish, then an amphibian, then a reptile, then finally becomes recognizable as a mammal.

Applied to the field of computing, one could say that the development of computing seems to parallel the development of human society. At first, the society was segmented into small primitive tribes, which slowly evolved into larger units, such as cities, counties, provinces, regions, countries, nations, and so on.

We are today standing on the threshold of global computing (i.e. the web). But, in certain ways, we still seem deeply entrenched in the primitive, stone-age world of tribal computing.

Think Locally

Most software in operation today is severely localized. Meaning, it was built to live on a single box, single CPU. Everything about it is very closed, very proprietary, and extremely local.

Some software allows for certain level of connectivity, whereby other software systems are given an opportunity to connect to the localized software and share/exchange some information.

Only the latest, most recent batch of software products (the so-called social software) has left the world of tribal computing and is reaching out to the global computing space.

Control Locally

Most tribal software was/is built with an engineering frame of mind. Whenever we approach building something with an engineering outlook, we are striving to introduce the maximum level of control into the system.

One of the most detrimental side effects of building software with an engineering slant is the temptation to retain the state of the conversation. As we've seen in our first installment (State), the best way to create brittle and buggy software is to insist on retaining the state of the conversation that had transpired during the operation of the software product.

In addition to that, insisting on staying local (i.e. tribal, single box, single CPU etc.) means that the point of control also stays tribal. There is a single authoritative instance that claims to know everything and that controls what can and cannot happen on the system. That instance then becomes a single point of catastrophic failure.

Relinquish the Control Globally

In contrast, non-tribal software exhibits stunning capabilities for growth thanks to relinquishing the rigid engineering attitude. One of the fundamental reasons why the web is such a spectacularly successful computing platform lies precisely in this abandoning of the tribal past and moving beyond the need to control and retain the state of the conversation.

By deciding to not care about the state of the systems engaged in the conversation, the non-tribal, globally oriented software is free to grow in any direction and to scale to any level of complexity.

We will see in the next installment what the optimal ways are to achieve that level of robustness. Stay tuned.

Tuesday, May 29, 2007

Software Development Detox Part 3: Join the Conversation

So far we've established that the unsophisticated software solutions tend to be brittle and unreliable due to the following erroneous decisions:
  1. Keeping track of the state of the conversation between the client and the server
  2. Choosing to expose services through messaging interfaces (a.k.a. Remote Procedure Calls)
Today we're going to look into the minimal requirements for establishing a platform that enables successful business computing. We will see how important it is to separate the state of the conversation that may have transpired between two or more business parties from the transformations of the state of the participating parties. We'll also look into the advantages of abandoning messaging interfaces when conducting business conversations.

Sustainable Business Strategies

Abandoning the world of software for a moment, let us examine the strategies that facilitate success in the world of business. Experience has shown that short-term success is certainly achievable through exclusivity (i.e. by locking customers into some sort of proprietary business model). However, to achieve any level of sustainability, businesses must, willy-nilly, open up, introduce free choice, and embrace diversity.

In order to attain the happy equilibrium of sustainable business growth, one component of the business interaction must be optimized -- conversation. Businesses thrive on open-ended conversation. What that means is that while initially only two privileged parties may begin a business conversation, other interested parties should be able to join and participate at any point.

For that to happen successfully, the barriers to entering the stream of conversation must be placed extremely low. In practical terms, anyone interested in joining the conversation should not be expected to learn any arbitrary set of rules, any new language, or any new protocol.

In addition to this fundamental requirement, parties joining the conversation midstream should also be able to easily examine the history. Anyone joining the conversation should be able to immediately gain access to the 'minutes' of all the previous conversations. That way, an interested party could, at their own convenience, examine the minutes and learn about some vital stats of the ongoing business transactions.

These two (the ability to join the conversation at any point in time and the ability to examine what had transpired up until that moment) are the most vital and essential requirements for conducting successful, sustainable business transactions.

Sustainable Business Computing Strategies

When it comes to achieving sustainable levels of business computing, the challenges tend to be even greater. The computing infrastructure and practices placed between the various business parties interested in joining the conversation serve to raise the barriers to entry.

Still, the promise of information technology is to make those barriers even lower than they tend to be for the non-digital business interactions. And in all truthfulness, that really is the true calling of this technology. So, the question then is: what went wrong?

The simplistic answer to the above question is that someone along the line made a couple of wrong decisions (see the erroneous decisions listed above) and created a complicated situation, resulting in businesses being unable to join interesting and potentially profitable conversations that are happening online.

Our job, then, is to remedy this situation, and to bring the technology back to the level where pretty much any interested business party is capable of easily joining the conversation. Not only that, but any party should also be able to easily examine the history of the conversation by replaying the 'minutes' of what had transpired while they were away.

Simplifying the Medium

In the world of information processing, bits and bytes could represent anything. But in the world of business computing, the medium that is of most interest is text. So in most situations, what is of particular business interest is content that is rendered as text.

Of a much lesser significance, but still figuring rather prominently in the world of business computing, are images. Less prominent media types would be audio and video.

So the viable computing platform that would fully facilitate successful business conversations is mostly focused on content rendered as text. That content should be marked up in order to achieve a certain level of semantic order. But the markup shouldn't be blown out of proportion.

Simplifying the Protocol

Once the medium for exchanging business content is sufficiently simplified, we must make sure that the channels for communicating that content are also fully simplified. Instead of expecting any interested party to learn the intricacies of a foreign language that is unilaterally enforced by the vendor, we must offer a severely restricted list of possible actions that are publicly vetted and extremely non-volatile.

For any resource that the businesses may be interested in, we can safely establish that only four actions would suffice when it comes to maintaining that resource:
  1. Add resource
  2. Remove resource
  3. Modify resource
  4. Fetch resource
The above four-pronged protocol is sufficient for conducting any business conversation. No further elaboration on the business computing protocol should ever be required (there is no need for any kind of messaging interfaces).
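
These four actions map directly onto the publicly vetted verbs of HTTP. A minimal sketch of that mapping (the resource path is hypothetical):

    # The four-pronged protocol expressed in HTTP's publicly vetted verbs.
    actions = {
        "Add resource":    ("POST",   "/invoices"),
        "Remove resource": ("DELETE", "/invoices/2007-0042"),
        "Modify resource": ("PUT",    "/invoices/2007-0042"),
        "Fetch resource":  ("GET",    "/invoices/2007-0042"),
    }

    for action, (verb, path) in actions.items():
        print("%-15s -> %s %s" % (action, verb, path))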

Joining the Conversation

When an interested party learns about an ongoing business conversation and wants to join in, all it has to do is fetch the state representation of the resource in question. From that point on, the interested party is free to propose any transition of the state of that resource. In addition, the interested party is also free to replay the conversation as it unfolded prior to joining in.

The above scenario is universal, and it applies across the board. Following this model, we can ensure that any business parties will have a successful transition to participating and contributing to the ongoing stream of business conversations.

Software Development Detox Part 2: Remote Procedure Calls

We've seen in our first detox installment that it is detrimental to maintain the state of the conversation between participating parties. The conversation that may have occurred between the interested parties and a targeted resource may have affected the state of the targeted resource. But that doesn't mean that the various stages of the conversation itself need to be retained.

Today we'll look into another form of software intoxication; this one has to do with how the conversation is implemented.

Typically, when two or more parties get engaged in a conversation, they tend to accomplish successful conversation by sending each other specific messages. For example, if an airplane is approaching the airport, the messaging between the pilot and the control tower gets peppered with code words such as "roger" and "over". That way, the conversation protocol gets established in the attempt to minimize the noise and maximize the signal.

In the loosely coupled world of business computing, free market forces dictate that many solution providers can compete for the most successful solution. That situation creates a wild diversity of proposed conversation protocols. What's happening is that basically any vendor (i.e. solution provider) is free to come up with a completely arbitrary set of formal rules outlining how the conversation is going to take place. That creates a hell of a lot of noise in the solution space, resulting in brittle and faulty implementations.

The culprit most of the time boils down to the methodology known as Remote Procedure Call (RPC). The remote procedure proposed by the vendor is a completely arbitrary, unilaterally constructed expression that the vendor expects the consumer to learn. In addition, the vendor reserves a unilateral right to change the expression encapsulated at the RPC level, and the consumer has no recourse but to re-learn the intricacies of how to talk to the vendor.

This unfortunate situation creates endless problems and headaches. All the consumers of the proposed vendor services are expected to keep up with an ever-growing inventory of those custom Remote Procedures, and on top of that are left vulnerable to future changes to that inventory.
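
A hedged illustration of that inventory problem (all procedure names here are made up): each vendor publishes its own vocabulary, and when the vendor unilaterally renames or retires a procedure, every consumer that relied on the old names breaks.

    # A vendor's arbitrary vocabulary of remote procedures, version 1.
    vendor_v1 = {"fetchAcct", "postTxn", "fetchAcctHistory"}
    # The vendor's unilateral revision in version 2.
    vendor_v2 = {"getAccount", "postTransaction", "getAccountHistory"}

    # The procedures one consumer happens to depend on.
    consumer_uses = {"fetchAcct", "postTxn"}

    broken = consumer_uses - vendor_v2
    print("Calls that stop working after the vendor's change:", sorted(broken))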

Because of that, effective detox therapy strongly advises against using the RPC model when architecting and designing your software solution. We will see in the next installment that there is a much better way to go about accomplishing a robust software architecture.

Saturday, May 26, 2007

Software Development Detox Part 1: State

In the first installment of the software development detox program, I am going to review misconceptions and misunderstandings of one of the most fundamental software concepts -- state.

The problem with understanding state in the context of information processing/software development arises when people fail to recognize and acknowledge that there are two distinct aspects of state in the arena of software development. By bundling up these two aspects, people end up projecting an incorrect picture and consequently paint themselves into a corner by choosing an unsuitable architecture.

I am now going to (temporarily) abandon metaphors (such as 'paint oneself into a corner' etc.) and switch to using simple, albeit somewhat exaggerated examples.

Software Types

In this example, I am going to review a common occurrence of a typical software construct, such as a date. A date is an abstraction devised to encapsulate and express the human concept of time. In the world of information processing, we use software constructs, such as types, to encapsulate and express abstractions such as the calendar date.

Suppose someone offers a software abstraction (i.e. a type) called CustomDate. This abstraction is supposedly capable of doing accurate date calculations, and is endowed with certain conveniences. One such convenience is the ability to express, in calendar terms, such human concepts as 'tomorrow', 'yesterday', 'next week', and so on.

So we see that this type is capable of certain behavior (such as being able to answer the question 'what date is tomorrow?'). But, in addition to discernible behavior, software types typically also possess state. For example, our CustomDate may possess the state of knowing what date the year-end falls on.

This state may change (different corporations have different year-end dates). And the instance of the type is expected to remember the changed state.

What can you say and how it gets interpreted

Upon acquiring a new software type, such as CustomDate, we will be expected to learn about its capabilities. We are not expected to understand how it works. We're not even expected to understand all of its capabilities. We are free to pick and choose.

For example, if CustomDate possesses 50 different capabilities, and all we want from it is the ability to tell us what date the year-end falls on, we should be able to safely ignore the remaining 49 capabilities.

To violate this basic agreement would result in creating brittle, unreliable software. Here is one fictitious example that illustrates this problem:

If we instantiate CustomDate and assign that instance a handle such as customDate, we should then be able to talk to that instance. If we are only interested in learning about our company's year-end date, we can send a message to our customDate instance, as follows:

customDate.year-end

In response to receiving that message, the customDate instance will return the actual year-end date to us.

The above described scenario should always yield the same behavior. There shouldn't be any surprises in how an instance of customDate behaves upon receiving the year-end message. If there is even the slightest possibility that the established message may render different, unexpected behavior, our software is not only brittle but extremely buggy.

By now you may be wondering how there could be any possibility that the above scenario would ever yield behavior different from what is expected. Let me explain with another example:

We've learned so far that, when dealing with an instance of customDate, we can say year-end and it will be interpreted as a question that reads: "could you please tell me what the year-end date is?" Consequently, the representation of the correct year-end will get rendered and served as a response to our query. We've thus realized that an instance of customDate has state. That state (i.e. the actual value of the company's year-end date) is the only state we're interested in when dealing with this software construct.

However, as we've mentioned earlier, this software construct may have 49 other capabilities and states, of which we know nothing. Now, the fundamental principle of software engineering dictates that we are absolutely not required to know anything about any additional, extraneous states and behaviors that a software construct may bring to the table.

Regardless of that prime directive, people who are not well versed in designing software solutions tend to violate this dictum on a daily basis. The way to violate the prime directive would be to introduce a certain state/behavior combo that modifies how the question gets interpreted. One can imagine how easy it would be to add a capability to CustomDate which would turn it into a currency conversion utility. This example is admittedly unrealistic and exaggerated, but I chose it to illustrate the foolhardiness of arbitrarily assigning various capabilities to a software construct.

In this example, an overzealous green developer may add a capability to CustomDate that will put it into a "currency conversion" mode. If someone else is using the same instance of CustomDate and puts it into this "currency conversion" mode, that change in its state may modify the behavior of an instance of CustomDate, rendering the response to the year-end question unintelligible.

Let's now run this hypothetical scenario:
  1. CustomDate gets instantiated as a resource on the server

  2. A message arrives from a client asking the resource to convert 100 USD to Canadian dollars

  3. An instance of CustomDate (i.e. customDate) puts itself into the "currency conversion" mode and renders the proper currency conversion

  4. The client then sends a message to customDate asking it for a year-end

  5. The instance renders an answer that corresponds to the value of 100 US dollars at the year-end
The above answer at step 5 comes as a complete shock to the client who asked for the year-end; the client wasn't aware that the instance can shape-shift and consequently may not always return dates when asked about the year-end.

In other words, what you can say and how it gets interpreted changes based on the state that an instance of the type may be in. That is a very bad situation, guaranteed to render that particular software program dysfunctional.
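
Here is a small Python sketch of that failure mode (CustomDate, its 'currency conversion' capability, and the conversion rate are all this article's hypotheticals): once one client flips the shared instance into conversion mode, another client's year-end question comes back as something other than a date.

    class CustomDate:
        """Hypothetical shared type whose mode is retained as conversation state."""
        def __init__(self, year_end="October 31"):
            self.year_end_date = year_end
            self.mode = "date"            # conversational state kept on the instance

        def convert(self, usd, rate=1.05):
            self.mode = "currency"        # side effect: the instance shape-shifts
            return usd * rate

        def year_end(self):
            if self.mode == "currency":
                # The question is now interpreted through the wrong mode.
                return self.convert(100)
            return self.year_end_date

    shared = CustomDate()
    shared.convert(100)                   # client A uses it as a currency converter
    print(shared.year_end())              # client B asks for year-end and gets 105.0

Had the year-end (the entity state) been the only thing the instance remembered, and the mode not been retained between requests, the second client would have received October 31 as expected.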

Statelessness

We can see from the above example how disastrous it can be to attempt to manage the state of a resource. In our case, we've been managing the state of an instance of CustomDate, keeping track of when it is a date-rendering machine and when it is a currency-conversion machine.

This tracking of the state resulted in the breakage of the working code. If we had abstained from keeping track of the state of the instance, the problems wouldn't have emerged in the first place.

From this we see that the only way to achieve robust and reliable software is to ensure that its constituent components are stateless. No memory of what had transpired during previous conversations should be retained.

However, keep in mind that we must distinguish between two types of state here:
  • Entity state
  • Conversation state
It is this conversation state that is troublesome. Entity state is perfectly valid, and should be memorized. In this instance, entity state would be the fact that our company's year-end is October 31.

Keeping track of what transpired as clients have interrogated an instance of a software component, and then retaining that state, is always disastrous. And yet that is how most inexperienced software developers tend to architect and design their software.

Coming up

In the next installment, we'll look more closely into how to architect and design stateless software.

Monday, December 25, 2006

Contracts and Coupling in Resource Oriented Architecture

We've seen that most software developers in operation today are besotted with the vendor model. One of the symptoms of that malaise is the deeply ingrained idea of a contract. Basically, it would appear that, in the minds of typical software developers, an interaction between software components cannot possibly occur without a predefined contract.

Of course, the contract they are talking about follows (albeit covertly) the implicit revenue model ingrained in a typical software vendor's business plan. Software vendors make their living by locking their customers into a contract.

But in cases where the use of the software product is not tied to any revenue model, the notion of a contract tends to lose its appeal -- for example, where an open source community is bettering its services by utilizing publicly vetted web standards and protocols.

There are no Unilateral Contracts

The web is a platform that publishes its capabilities in a unilateral fashion. Similar to any other publicly vetted service, the web is calibrated to serve the lowest common denominator.

For example, in developed countries around the world there is a publicly vetted service called traffic. Various vehicles are free to enjoy the commodities that the public roads offer, mostly free of charge. However, the reason this service is made so freely available, with no strings attached, lies in the fact that it comes with an extremely pronounced and unbendable bias. That bias states that all vehicles moving along the public roads must keep to one prescribed side of the road.

If this system wasn't calibrated in such a way, it would be impossible to offer the amenities for conducting the public traffic.

So we see that the utmost flexibility of an infinitely complex system, such as public traffic, is only possible if that system is heavily calibrated in a non-negotiable fashion.

And because that calibration is non-negotiable, it does not qualify to be called a contract. In other words, if you're driving your car down the road, you're not holding in your hands a traffic contract that outlines all the details binding you and the road owner. The road owner (i.e. the public) dictates the rules, and does so unilaterally. You have no say in whether you accept keeping to the prescribed side of the road while driving. You simply must obey the unilateral rule, or suffer the consequences.

To put it more mildly, you must 'get used to it'.

The same principles apply when you're 'driving' around the web. The web specifies its non-bendable rules and regulations upfront, and you have no say in the matter: whether you like them or not, whether you plan to obey them or disobey them. The web doesn't expect you to make your stance known, and the web absolutely doesn't care.

Because of this unilateral set of rules, there cannot be a contract between the server software on the web and the web client.

The Web is Uncoupled

Because of the lack of a bilateral contract between a web server and a web client, it is not possible to talk about any form of coupling on the web. Tight and loose coupling, terms tied to the world of software vendors' applications, lose any meaning on the web.

There are two reasons for this change:
  1. On the web, the server need not know anything about the client
  2. On the web, the server is blissfully unaware of the existence of any other server
All that is known about the web is that it is a sprawling collection of resources; these resources are easily accessible via publicly vetted standards (URIs resolved over HTTP), and they know how to represent themselves and how to make their own state transitions upon request.
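As a small, purely illustrative Ruby sketch (the URI is hypothetical): the client knows nothing but the resource's URI, and the server learns nothing about the client beyond the single request in front of it.

  require 'net/http'
  require 'uri'

  # The only shared knowledge is the resource's URI.
  uri = URI.parse('http://example.com/customers/42')
  response = Net::HTTP.get_response(uri)

  puts response.code   # e.g. "200"
  puts response.body   # the resource's representation of itself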

Thanks to this particular bias, the web is the most scalable, the most interoperable, and the most robust, resilient and comprehensive computing platform ever.

Saturday, December 23, 2006

Resource Oriented Architecture and Quality

In his recent post exposing imaginary limitations of the REST paradigm, Udi Dahan laments the lack of explicit dialog that would discuss the benefits of adopting REST. More specifically, he writes:

"What I haven’t heard is how I can use it (i.e. REST) to build large-scale distributed systems that are controlled by complex logic."

Here, I believe he is talking about his definition of quality. Building large-scale distributed systems that are controlled by complex logic sounds like a very costly proposition. Something one would not attempt just for the heck of it, just to kill a rainy weekend.

Because such a project entails a price tag of hundreds of thousands, if not millions, of dollars, it is not to be taken lightly. Also, it is not something that happens very often. I would venture to say that most of us have never been involved in such a project, nor will we ever be. It is not very often that businesses embark upon building such complex distributed systems. Yes, eBay and Amazon etc. are such systems, but you could almost count these beasts on the fingers of one hand.

So the question that mystifies me a bit here is: why? Why worry about building such a large-scale distributed complex system? What's the business case for it? How many of those are you planning to build next year? What's your compelling reason for going after such a risky venture?

But assuming that Udi has his valid reasons, and that he has millions of budget dollars to back him up (which, frankly, I seriously doubt), let's engage him in his quest, for the sake of having fun.

What I'm assuming here is that, to Udi, large-scale complex distributed systems spell quality.

So You Want to Build a Multi-million Dollar System?

Why would a system we're building end up costing millions of dollars? There are several possible reasons:
  1. We are using primitive technology, and thus it's taking us too long to build
  2. We don't understand what the system is supposed to be doing, so we're making things up as we go
  3. The scope of the system is on a runaway curve, so we end up with an explosion of features (i.e. featuritis)
In the case of primitive technology, we can imagine trying to dig the Panama Canal by giving each worker a wooden spoon. That technology would be considered very primitive for the intended job, and would therefore result in an astronomical project cost.

Similarly, if we attempt to build an equivalent of amazon.com or ebay.com using assembler, we'll end up paying an astronomical price for using such primitive technology.

In the case of not understanding the goals that the system under construction is supposed to deliver, 'fishing for results' would invariably turn out to be prohibitively expensive.

Finally, in the case of scope runaway, featuritis and bloat are well known for being insanely expensive.

Let's say Udi is unfazed after reading this, and still wants to proceed with building his multi-million dollar large-scale complex distributed system. He is now curious how REST and ROA can be of any benefit to him. Let's explore.

Can REST and ROA Help?

My knee-jerk answer is: not really. But let's not give up so easily.

First, let's make two things clear:
  1. If we don't understand what the system is supposed to be doing, then no amount of technology can salvage us from dying a miserable death
  2. If we contract the featuritis disease and start bloating, there is equally no technological cure; we'll be doomed to collapse under our own weight
So let's assume here, for the sake of having fun, that Udi is immune from the two forms of malaise mentioned above. Let's say he has a very clear understanding of the goals of the large-scale distributed complex system he is trying to build, and let's also assume that he is immune from the featuritis and bloat (both assumptions being extremely unlikely, by the way, but let's pretend they've been licked).

Seeing a Problem where there isn't Any

OK, so Udi is concerned about the following:

"There are business rules that control which data should be returned and when. All this has been known for quite a while and has been modeled around transaction isolation levels in the OLTP sense, and needs to be handled at the application level when it comes to intelligent caching. I haven’t heard anything about REST that deals with this."

My question is: why on earth should REST be expected to deal with that? REST and ROA, as their names imply, specialize in dealing with Resources and their Representations. They don't specialize in dealing with the infrastructure that underpins those resources.

For all we know, that infrastructure might very well be implemented as OLTP, or perhaps as something else entirely. In the same way, an RDBMS's ability to index the data stored within it might be implemented using any number of algorithms. But in reality, who cares? The important thing is that it works and meets our expectations.

Next, Udi asks:

"If, as a result of updating one entity, other entities are affected how is that information propagated back to the caller?"

This is simple: the caller consumes resources by dealing with their representations. This is the essence of REST. The ground rule in ROA is that a raw resource can never be touched by the client, or by the caller. All one is allowed to work with is the resource's representation.

Udi talks about updating an entity. I'll try and translate this into ROA speak. Instead of updating an entity, in ROA we speak of requesting a state transition for a resource. The way it works is as follows:
  1. A resource gets identified (via the URI)
  2. That resource then gets located (via the corresponding URL)
  3. A request is then sent to that resource to change its state (i.e. to make a state transition)
At the end of that chain of events, other resources (or, 'entities' in Udi's parlance) may or may not get affected.

His question, then, is: how is that fact (i.e. that another resource got affected during the state transition) propagated back to the caller?

The answer is glaringly simple: it all depends on the use case. If the use case requires that the caller be notified of the state transition, then the appropriate representation of the new state will be presented to the caller in the form of a response.
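To make that concrete, here is a minimal Ruby sketch (the URI, payload and media type are all hypothetical, not taken from Udi's system): the caller asks a resource to make a state transition, and because this particular use case wants a confirmation, the response carries the representation of the resource's new state.

  require 'net/http'
  require 'uri'

  # Hypothetical resource; the body describes the state transition we request.
  uri = URI.parse('http://example.com/orders/1001')

  put = Net::HTTP::Put.new(uri.path)
  put['Content-Type'] = 'application/xml'
  put.body = '<order><status>shipped</status></order>'

  response = Net::HTTP.new(uri.host, uri.port).request(put)

  # All the caller ever receives is a representation of the new state.
  puts response.code
  puts response.body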

One really feels like stopping at this point and asking: what's the big mystery? Why is it that all these highly paid software architects are having such a hellish time grasping this extremely simple concept of resources, their states, and their state transitions?

I am indeed forced to conclude that they seem to be working really hard on trying to see a problem where there isn't any. Is that why they get paid those big bucks?

Here's the Skinny

Udi even gives us a tiny homework:

"Let’s say I have a simple rule: When the sum of all orders for a customer in the past quarter increases above X, the customer is to become a preferred customer. Do I PUT the new order resource to the customer resource, or should I have a generic orders resource that is “partitioned” by customer Id? Once the change to the customer occurs, should I PUT some change information somewhere so that the initiator can find out where it should go look for the change? And how should I be handling transactions around all these resources?"

Again, he is confusing apples with oranges. Let's break it down gently:

A customer is a resource. We don't know how that resource is implemented behind the firewall, and we really, really don't want to know! This point is crucial to understanding ROA. Please keep in mind that ROA is definitely not about the how.

OK, we now need to understand that a resource (i.e. the customer) has state. For example, the resource could be in the state of being a preferred customer, or of not being a preferred customer. However, we have no idea how that state is implemented. Also, we don't want to know anything about those gory details.

On to how that state changes. Recall that REST stands for "Resource State Transition" (actually, this is incorrect, and this section talks about ROA, not REST; please see the correction notice at the end of this post). So 'state transition' is the crucial phrase here. Obviously, a resource is capable of making a transition from one state to another, such as the transition from the state of not being a preferred customer to the state of being a preferred customer.

How and when does that state transition occur, you ask? Well, the transition is driven by a particular event that must be described in the use case scenario. In Udi's use case above, when the resource's state (such as the total value of accumulated orders) reaches a predefined threshold, the resource knows how to make the transition from not being a preferred customer to being one.

Again, we don't know how this transition takes place. There are millions of conceivable ways it could happen, but the important thing is that it does.

All of Udi's questions surrounding this hypothetical scenario boil down to how, but that's largely irrelevant. Somehow the resource knows the proper way to make the transition from one state to the other. When the client then makes a request to get that resource, the resource in question will know how to represent its state, including the newly won 'preferred customer' feather in its cap.
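For illustration only, here is a minimal Ruby sketch of the idea (the class and the threshold are hypothetical, and the "past quarter" window is simplified to a running total): the customer resource owns the rule, performs its own state transition when the threshold is crossed, and all a caller ever gets back is a representation of the resulting state.

  # Hypothetical sketch; not Udi's system, and the quarterly window is simplified.
  class Customer
    PREFERRED_THRESHOLD = 10_000.00   # the 'X' from the business rule

    def initialize
      @order_totals = []
      @preferred = false
    end

    # Adding an order may trigger the state transition; the caller never
    # drives the transition directly, the resource does it for itself.
    def add_order(total)
      @order_totals << total
      @preferred = true if @order_totals.inject(0.0) { |sum, t| sum + t } >= PREFERRED_THRESHOLD
    end

    # The only thing ever handed out: a representation of the current state.
    def to_xml
      status = @preferred ? 'preferred' : 'regular'
      "<customer><status>#{status}</status></customer>"
    end
  end

  c = Customer.new
  c.add_order(7_500.00)
  c.add_order(4_000.00)
  puts c.to_xml   # => <customer><status>preferred</status></customer>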

Correction: I've just realized that, in my eagerness to explain the encapsulation that Resource Oriented Architecture offers, I've made a mistake and erroneously ascribed state transition to REST. Actually, REST stands for Representational State Transfer, not Resource State Transition. The latter is a characteristic of ROA, not REST.

Sorry for the inadvertent error.

Thursday, December 21, 2006

RESTafarian Jihad

I've just been accused of waging a RESTafarian Jihad. I feel that this accusation is unjust, and that it is uncalled for. So I need to explain and defend my case here.

Jihad is a strong word in this context, because it has come to mean 'holy war' in modern parlance. And I'm nowhere near waging a holy war on anything, least of all on the non-REST platforms.

To be more specific, I am absolutely not interested in converting the Remote Procedure Call (RPC) people to REST. True, approximately 98% to 99% of all software developers I've ever met cannot imagine building a distributed software system without utilizing some form of RPC, so for these people, Resource Oriented Architecture (ROA) is meaningless.

But as I've already explained in one of my recent posts, I view these people as a write-off. Basically, I think there is nothing we can do about that, and that it's time we cut our losses and move on with our lives.

So you see, I'm leaving the RPC crowd (which means almost all software developers in existence today) alone. I'm not waging any Jihads, I'm not plotting to convert anyone.

Instead, I strongly believe that, as it has always been the case, the simpler system, with lower barriers to entry, will somehow win in the end. I don't really know how, but God's ways are mysterious, so we'll see. But same as CORBA had to disappear because it was just too unwieldy, web services and SOA will also be forced to vanish from the scene.

At that point, ROA will take the lead. Right now, we're preparing for that portentous event. It will come in its own sweet time. We're not rushing anything.

We know that, eventually, people will abandon the ways of the past century, and will embrace the twenty first century ways of employing high technology.

Friday, December 15, 2006

Are Software Developers a Writeoff?

My friend Ted introduced me to the idea that we won't be able to see the real effects of the web until the time when the last person who can remember the days before the web dies out. In his view, if you can remember life before the web, then you're not capable of fully grasping the web.

I tend to agree with him. The web is such a drastic paradigm shift that we're not really capable of fully grasping its significance.

But the young and upcoming generations, who can't remember the days without the web, will be much better equipped to cope with it. Same as I take electricity and indoor plumbing for granted, my children take the web and the perpetual availability of the wireless connectivity for granted.

And that makes them a different kind of person than I, their father, am. And I find that to be a very fascinating phenomenon.

New Breed of Software Developers

Similar to how my children are a new breed of humans who have a completely different outlook on the web than I do, there is an upcoming new breed of software developers who are diametrically opposite from the existing, entrenched batch of software developers.

The existing developers, as I've been discovering through a gruesome series of interviews lately, seem completely oblivious to the incredible powers of the web. All the developers I talk to seem to think that the only way to accomplish something in the software development arena is to use function calls. They all follow the software vendors' lead, which typically means more complexity, more heavy lifting, and less palatable solutions.

Because this tendency is so pervasive and ubiquitous among existing software professionals, I have reached a point where I'm pretty much ready to throw in the towel. In other words, I'm this close to calling all your typical software developers, all those entrenched Java, J2EE, .Net, Oracle etc. developers, a big fat writeoff.

I am slowly starting to think that this workforce, as huge as it is, is pretty much useless when it comes to business computing. Some of them may be good for developing certain components of the computing infrastructure. But as soon as we get to the point of utilizing this infrastructure for the benefit of the business, they all turn out to be nothing but a writeoff.

And by 'writeoff' I mean more harm than good.

This is why I'm forced to turn my attention to the young and upcoming generation of software developers. We need to ensure that these young people don't get tainted by the tool vendors' agendas and business models. We need to make sure they don't get caught in the remote procedure call hell.

Resource Oriented Architecture and Vendors

Suppose we get to the point where Resource Oriented Architecture (ROA) becomes prominent and gains wide adoption. Given the ubiquity, pervasiveness and popularity of the web, it is not a far fetched proposition. After all, the native web is strictly resource-based, and a Resource Oriented Architecture that utilizes that orientation seems like a perfectly natural fit.

Now, if this re-orientation happens, what will the vendors do?

Vendor Business Model

Software vendors are in the business of making profit from selling software based solutions. Traditionally, the most efficient solution, from the vendors' profitability point of view, tends to come in the form of a tool. This is why we tend to call these vendors 'tool vendors'.

Because of this orientation, vendors tend to favor system architectures that are complex, convoluted, and difficult to fathom and master. Of course, anything can quite easily be made more complicated than it already is. In other words, it is quite easy to muddle the issue, to add complexity, to increase brittleness. You don't have to be very smart or very competent in order to make things more complicated.

As a matter of fact, complicated situations are always a surefire sign of stupidity. If, on the other hand, you find systems that work but are nevertheless very simple and easy to navigate, that invariably means that someone quite intelligent has devised the solution.

But, since an easy to understand, easy to navigate and use system is unprofitable from the vendors' point of view (i.e. such systems do not require 'tools' in order to be utilized), vendors do not see any value in playing along with them. Vendors prefer murky, complicated architectures. Getting you involved with such ungodly systems and architectures is a very lucrative proposition for the vendors, because then they have a very good justification to approach you and start selling you their tools.

Do We Need Tools?

Take a practitioner of a well established profession, such as a lawyer, or a mathematician. These professionals deal with extremely elaborate, intricate systems and bodies of knowledge. They are expected to navigate very sophisticated waters of highly formalized systems.

But what kind of tools do they need in order to do their jobs? In reality, very few. A lawyer may need nothing more than a typewriter, and a mathematician needs even fewer tools -- a pen and paper will do just fine.

How come? It's simple -- their job is largely conceptual by nature.

Now, the question is: how is that different from software development? Why do we think that software development is not conceptual by nature, in ways similar to mathematical development?

The thing is, software development is not different in any significant way from mathematical or legal work. They are all largely conceptual activities.

And realistically speaking, all these conceptual activities don't need many specialized tools. It is a great fallacy, perpetrated on us by the tool vendors with their business agendas, that software development needs an elaborate set of tools.

Choose the Right Foundation

And since we are dealing with a fairly conceptual activity when developing software, it is of crucial importance that we select the right foundation, the right philosophy, and the right paradigm from where we can operate successfully.

Choosing the right starting point will ensure that we don't get entangled in the morass of complexity, brittleness, analysis paralysis and the like. Otherwise, it would be devilishly easy to slip and end up in a blind alley.

If we now review the philosophy and the paradigms that the tool vendors are offering us, we see that they prefer to side with complexity, heavy lifting and hard work. The vendors are proposing we adopt systems that are architected for ever growing complexity. Theirs is the business model of introducing a big hairy problem, and then selling the solutions that would be pain killers.

So what's the alternative? We propose that instead of aiming at acquiring pain killers from the vendors, we focus our efforts on acquiring vitamins.

And in our vision, vitamins would be the publicly vetted standards upon which the web, as a universal computing platform, is built. The web offers us simple, extensible, scalable, interoperable and linkable architecture. As such, it is very beneficial to us in the sense that it does not introduce the pain of complexity which then needs to be handled with pain killers.

Wednesday, December 6, 2006

It's Not the How, it's the What

By now, we all know that intentions are older than implementations. There were times when foolish software developers used to think the reverse was true, but thankfully today most of them have been browbeaten into aligning with the program.

So everyone seems to agree nowadays that articulating the what (i.e. the intention) is much more important than articulating the how (i.e. the implementation). So where's the problem then?

Web Development Challenges

While every prudent software developer has by now learned the lesson of separating the what from the how, they still seem to be applying it at too low a level. Similar to how the budding software developers of the olden days used to build those monstrous, huge, bloated monolithic programs that ended up being totally unwieldy and unmanageable, today's web developers are falling into the same trap of building similarly huge, bloated, unwieldy web sites.

Or, let me put the question up to you like this: what would you rather build -- a huge monolithic tightly coupled software program, or an open-ended set of smallish, self-contained, loosely coupled software units, that are easy to implement, easy to test, easy to troubleshoot and debug, easy to deploy?

Of course, if you are a software developer worth even a tiny bit of your own salt, you'd go for the latter. Some of us still recoil in horror when the flashbacks of bad old days of monstrous monolithic software programs come back to haunt us. Ugh, not a happy memory!

But it's really funny how these time honored principles of separation of labor, inversion of control, etc., do not seem to translate at all when ported to web development. Here, when developing web sites, budding software developers are more than happy to fall into the same old trap of muddling the concerns and seeking a single point of control. This ends up producing incredibly messy, monolithic, coupled and unmanageable web sites.

This, again, is the outcome of blindly rushing in to grab the how! How do I do such-and-such in Rails? How do I do such-and-such in Ajax? How do I do such-and-such in Flex? And on and on...

People never seem to stop to think what. What do I need to do in order to make this work? Forget about how to do something, and focus on the what.

How you do something is really not a big deal. 99 percent of the time it's just a short google search away. So why sweat the small stuff?

The what is much tougher. Google will not be able to help you there. You'd need your own cool head, your own common sense to figure that one out.

The surefire sign that you're charging down the wrong path is if you find yourself building large monolithic web sites. You know the ones that rely on the function calls, with one central point of control? The ones where it gets progressively difficult to introduce any new capability?

You know the ones I'm talking about. I bet you that right now you're working on one such site.

Well stop! Same as you wouldn't work on a single humongous big ass class, but would rather chop it up into dozens and dozens of smaller, more manageable classes, you must do the same with that web site. Abandon it. Chop it up. Give up the single point of control, give up the irresistible need to shoehorn everything through the single funnel of omniscient point of control.

Think back to your other software development skills. Remember how tough it was until you learned how to delegate properly?

Well, now's the time to do the same delegating, only not with objects, but with web sites. Treat each web site as a single independent component that specializes in offering some useful resources. Then open up the traffic gates. Let those independent, standalone 'components' interact, and see what happens.

See, I gave you the what; the how should be a piece of cake for you now.

Tuesday, December 5, 2006

Web Model for Business Computing

After reviewing the Vendor Model for Business Computing, it is now time to look into the web model. To refresh your memory for a moment here, the fundamental differences boil down to how each model treats data vs business logic.

As we've seen, the vendor model tends to freely expose the data while jealously hiding the business logic. It operates on the fundamental premise that data is deadwood that can only be animated thanks to the vendor's marvelous business logic. Without that magical proprietary processing logic, the data remains useless and thus worthless.

Consequently, vendor model insists that we pay good money to gain access to their beloved processing logic.

Web Model is for Humans

Unlike the vendor model, the web model was built with human beings in mind. Instead of insisting that some cryptic, arcane and privileged proprietary business logic be the animator of the data, web model leaves data animation to the most suitable subjects -- human users.

This is why products based on the web model have nothing to hide when it comes to the processing logic. They don't deem this logic very important. As a matter of fact, such products leave almost all of the decisions as to how one would like the data processed to the human users.

Which is where that decision rightly belongs, after all is said and done.

Web Model Hides the Data

Quite surprisingly, as much as the web model liberates us from the slavery of being governed by some secretive proprietary processing logic, it is equally adamant that we don't gain access to the data.

Instead, the web model insists that we only gain access to the representation of the data. But as to what the actual data looks like, or how it really feels, we remain forever mystified.

We'll talk some more in the upcoming series about why such an arrangement is a much better thing for all consumers of software.

Monday, December 4, 2006

Web Model vs. Vendor Model

There are two models of business computing: vendor model and web model. Vendor model is presently the entrenched model, the one that everyone is locked in.

The web model is brand new. Ask any software developer about the web model of business computing, and they wouldn't know what that model is or how it works.

But that doesn't mean that the web model is irrelevant. Quite the reverse, it is the most significant thing to ever happen in the world of business computing.

How is the Web Model Different?

Perhaps the best way to understand the web model of business computing is to discuss how it differs from the vendor model.

Before we delve in, let's just quickly recapitulate the characteristics of business computing. You may recall that any business computing model must be based on three fundamental characteristics:
  1. Identity
  2. State
  3. Behavior
In the business computing parlance, and for the purposes of our discussion, we can say that the state is equivalent to data, and the behavior is equivalent to the program logic.

Getting back to the difference between the vendor model and the web model of business computing, we will see that the two models fundamentally differ in the way they treat data and programming logic.

Vendor model insists on exposing the data and hiding the programming logic.

Web model, you guessed it, is the exact opposite -- it is based on the architecture that hides the data and exposes the programming logic.

Both models have their use, and in our ongoing series on the Resource Oriented Architecture (ROA, which is the underpinning of the web model), we will be discussing the advantages of the model that hides the data.

Stay tuned (and don't feel shy to ask any questions).

Saturday, December 2, 2006

Resource Oriented Architecture -- Growing Pains

I've already discussed the problems related to how traditional software development workforce seems incapable of grasping Resource Oriented Architecture (ROA). Today, I will tackle another of the misconceptions that seem to inevitably keep popping up in the software development circles.

Someone who very appropriately claims that adding simplicity is an engineering mantra wrote about the Promise of SOA, REST, and ROA. Here, we will focus only on his view of ROA:
ROA is an intriguing proposition. Applications are freed from worrying about services at all. A resource, which is effectively an entity, exposes the state transitions it is willing to accept. Applications don't even care what the resource is, they simply decide whether they are interested in the state transitions available. The level of omniscient inference with this approach is difficult to even explain. So, if the resource is a tennis court and the state transition is reserve (is that a state change or operation?) an application can decide, without caring about the resource per se, that it wants to reserve it. Of course if this is a seat at a concert you may be committing to a financial transaction. At the very least, the state may change back without the applications knowledge (except it's omniscient so that isn't possible).
Let's pick this apart, to see if we could identify sources of confusion.

Applications and Services

The post quoted above states that: "Applications are freed from worrying about services at all." First off, it is important to understand that ROA is not about applications. This was hinted at in my post on the golden age of software applications.

ROA is, as its name implies, all about resources. Some people claim that it's basically the same as being all about services, but I strongly disagree.

Resources and Entities

He continues: "A resource, which is effectively an entity..." Maybe I should let this slide, but let me just point out that a resource is effectively a resource. No need to drag entity into the picture.

Acceptable State Transitions

Moving along: "[Resource] exposes the state transitions it is willing to accept."

This is absolutely incorrect. Resource only exposes its representation, not the state transitions it is willing to accept.
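As a purely hypothetical illustration of the point, a resource hands back nothing more than a representation of its current state; notice that there is no menu of operations or "accepted transitions" anywhere in it.

  # Hypothetical sketch: a representation of the tennis court's current state.
  def court_representation(court_id, reserved)
    status = reserved ? 'reserved' : 'available'
    "<court id=\"#{court_id}\"><status>#{status}</status></court>"
  end

  puts court_representation(7, false)   # => <court id="7"><status>available</status></court>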

Applications and the Attempts to Simplify Reality

"Applications don't even care what the resource is, they simply decide whether they are interested in the state transitions available."

He is again talking about applications. And by 'applications' I suspect he means blobs of pre-compiled code.

Again, this is the old, vendor-dictated school. ROA, on the other hand, is the new, vendor-averse school.

Unlike the vendor-giddy applications, ROA does not waste time in the vain attempts to simplify reality. ROA recognizes and realizes that reality is unthinkably complex. And so ROA lives with it, is based on it, grows from it.

So ROA is not interested in any intricacies of the state transitions that are specific to any of the involved resources. The reason for this is that it would be absolutely impossible to be interested in such things, given their astronomical complexity and volatility.

ROA therefore throws the towel in at the very outset. It is non-competitive by nature.

Not Knowing is the Most Intimate

The author goes on: "The level of omniscient inference with this approach is difficult to even explain."

There is a famous Zen koan in which a Zen Master asks a question, and his student responds: "I don't know, Master".

"Not knowing is the most intimate!" replies the Master.

What is a State Transition?

Seems like traditional software developers are clueless when it comes to grasping the state transition model in ROA. For example, the author of the above post continues:

"So, if the resource is a tennis court and the state transition is reserve (is that a state change or operation?) an application can decide, without caring about the resource per se, that it wants to reserve it. "

There is no such thing as a state transition called 'reserve'. In Service Oriented Architecture (SOA), yes, one can envision such a state transition. But never in ROA.

Furthermore, here is that pesky application again. Who is this application, on behalf of which vendor is it entering the picture, and how is it deciding what to do?

The State Changes

Finally, there is (at last!) one thing that both the old school and the new school of software development agree upon: the state may change! Hallelujah!

"At the very least, the state may change back without the applications knowledge (except it's omniscient so that isn't possible)."

The above sentence actually doesn't make any sense. Who is omniscient? The application? Who made this application?

The whole point is that, when talking about resources, the old school developers are actually not talking about resources (the article quoted above illustrates this rather poignantly). And then they wonder out loud why they don't get ROA.

All that the old school developers seem capable of thinking about is applications. This betrays the extent to which they have been brainwashed by the tool vendors.

I am fairly positive at this point that unless software developers manage to deprogram their brains and escape the tool vendors' programming (i.e. brainwashing), they will remain completely lost for the world of Resource Oriented Architecture.

Just as well.

Thursday, November 30, 2006

Resource Oriented Architecture and OLTP

As Resource Oriented Architecture (ROA) slowly but surely starts to gain momentum, there will inevitably be loads of misunderstandings popping up from all sides. Today, we'll look at the grievances posted in the Give it a REST article by someone's "Chief Architect" Udi Dahan.

Here is what Udi claims:
If REST wants to conquer the enterprise, it had better get down from its "internet-scale" high-horse and start talking in terms of OLTP.
This is like saying "if OLTP wants to conquer the enterprise, it had better get down from its RDBMS high-horse and start talking in terms of registers and bits and bytes."

There is always something that is at the lower level than the system under study. Just because there is a lower level, doesn't automatically mean we should stoop to it.

The entire point of REST and ROA is precisely liberation from the need to stoop down to the lower level. Using these principles, one gains the necessary discipline to resist the ingrained urge to stoop to the lower level.

In case this is not clear to you, keep in mind that the first word in REST is 'representational'. Representational means representing something that sits at a lower level.

And since we have the representation to play with, we don't have to worry about the underlying lower level.

So, in this case, you can envision ROA being a representation of an OLTP system. Yes, we know that behind the ROA veneer there is an ugly OLTP system. But we don't want to really go there.

Just as we know that behind the OLTP system, there is an ugly flat file, assembler code system. And we equally don't want to go there.
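As a minimal, purely illustrative Ruby sketch (all names hypothetical): the representation is the only public surface, and whatever sits underneath it, be it an OLTP system, a flat file, or anything else, can change without callers ever noticing.

  # Hypothetical sketch: the representation hides the lower level entirely.
  class AccountResource
    def initialize(account_id)
      @account_id = account_id
    end

    # Callers only ever see a representation of state.
    def representation
      "<account id=\"#{@account_id}\"><balance>#{current_balance}</balance></account>"
    end

    private

    # The 'ugly lower level': could be an OLTP query, a flat file, assembler...
    # Swapping this implementation out changes nothing for the callers.
    def current_balance
      '1024.00'   # stubbed for the sketch
    end
  end

  puts AccountResource.new(42).representation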

So try and train yourself not to stoop to the lower level. Enjoy the privileges and the benefits of the freedom that ROA brings to the table.

Friday, November 24, 2006

Rails 1.2 Drives the Final Nail in the Database Coffin

This morning's announcement that Rails 1.2 is about to hit the streets signals the beginning of the end of the database as we know it. Provided that no major roadblocks are discovered and that the 1.2 release candidate firms up and becomes a real product, we're looking at the most important advancement in software development since the invention of Ruby 10 years ago.

You Don't Have to Hold Your Breath Anymore

I've been advising my friends, associates and customers to hold off their transition to Rails until Release 1.2. The reason I was doing that is because I didn't want them to be forced to learn something and then unlearn it and relearn a completely different approach. This is now the perfect time to jump into Rails, hook, line and sinker. There is absolutely no reason to hold back anymore.

Resource Oriented Development Changes Everything

The reason I think that Rails 1.2 is a death knell for the databases lies in the fact that it brings full fledged resource oriented development into the mainstream.
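As a rough sketch of what that looks like in practice (the model names are hypothetical), Rails 1.2's routing lets you declare resources directly, and the standard HTTP verbs then map onto them:

  # config/routes.rb -- a typical Rails 1.2-era sketch with hypothetical models
  ActionController::Routing::Routes.draw do |map|
    # One declaration exposes orders as addressable resources:
    #   GET /orders, POST /orders, GET /orders/1, PUT /orders/1, DELETE /orders/1
    map.resources :orders

    # Resources can be nested, e.g. /customers/42/orders
    map.resources :customers do |customer|
      customer.resources :orders
    end
  end

Each URI names a resource, and the request method names the state transition being asked for; that is the resource orientation in a nutshell.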

To quote "that other guy" from Seinfeld (from The Dealership episode; KRAMER: Listen to me. When that car rolls into that dealership, and that tank is bone dry, I want you to be there with me when everyone says, "Kramer and that other guy, oh, they went further to the left of the slash than anyone ever dreamed!"):

"Things will be different from now on."

Why do I think things will be different from now on? You won't know until you try resource oriented development for yourself. Same as with Ruby (and LSD), if you haven't tried it, you won't know what all the fuss is about.

A very significant side effect of this kerfuffle is that databases are from now on going to undergo progressive trivialization. It would take a lot of time for me to explain why this process is inevitable -- I'll save you the time by suggesting that you go and install Rails 1.2 and see for yourself.