I'm still continuing my exploration of the philosophy behind building
distributed applications following the principles of the REpresentational State Transfer (REST) architectural style and Web-style software. Recent comments on my blog have introduced a perspective that I hadn't considered much before.
Robert Sayre wrote:
Reading over your last few posts, I think it's important to keep in
mind there are really two kinds of HTTP. One is HTTP-For-Browsers, and
one is HTTP-For-APIs.
API end-points encounter a much wider variety of clients that
actually have a user expecting something coherent--as opposed to bots.
Many of those clients will have less-than robust HTTP stacks. So, it
turns out your API end-points have to be much more compliant than
whatever is serving your web pages.
Sam Ruby wrote:
While the accept header is how you segued into this discussion, Ian's
and Joe's posts were explicitly about the Content-Type header.
Relevant to both discussions, my weblog varies the Content-Type
header it returns based on the Accept header it receives, as there is
at least one popular browser that does not support
application/xhtml+xml.
So... Content-Type AND charset are very relevant to IE7. But are
completely ignored by RSSBandit. If you want to talk about “how the Web
r-e-a-l-l-y works”, you need to first recognize that you are talking
about two very different webs with different set of rules. When you
talk about how you would invest Don's $100, which web are you talking
about?
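To make Sam's point about varying the Content-Type on the Accept header concrete, here is a minimal sketch in Python. It is not how his weblog actually does it; the function names are hypothetical, and the rule is deliberately simplified: serve application/xhtml+xml only to clients that explicitly list it with a non-zero q value, and fall back to text/html for everyone else (including clients that only send a wildcard).

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        q = 1.0  # q defaults to 1 when the client does not give one
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable q as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def negotiate_headers(accept_header):
    """Pick response headers based on the Accept header (simplified rule)."""
    wants_xhtml = any(
        media_type == "application/xhtml+xml" and q > 0
        for media_type, q in parse_accept(accept_header or "")
    )
    content_type = "application/xhtml+xml" if wants_xhtml else "text/html"
    return {
        "Content-Type": content_type + "; charset=utf-8",
        # Tell caches that the response depends on the Accept header.
        "Vary": "Accept",
    }
```

A browser that advertises XHTML support gets application/xhtml+xml, while a client that sends only `*/*` or no Accept header at all gets text/html. A fuller implementation would also rank the q values against each other, which this sketch skips.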
This is an interesting distinction and one that makes me re-evaluate
my reasons for being interested in RESTful web services. I see two main
arguments for using RESTful approaches to building distributed
applications on the Web. The first is that it is simpler than
other approaches to building distributed applications that the software
industry has cooked up. The second is that it has been proven to scale
on the Web.
The second reason is where it gets interesting. Once you start reading
articles on building RESTful web services, such as Joe Gregorio's How to Create a REST Protocol and Dispatching in a REST Protocol Application,
you realize that the way REST advocates say one should build
RESTful applications is actually different from how the Web works. Few
web applications support HTTP methods other than GET and POST, few web
applications send out the correct MIME types when sending data to
clients, many Web applications use cookies for storing application
state instead of allowing hypermedia to be the engine of application
state (i.e. keeping the state in the URL), and in a surprisingly large
number of cases the markup in documents being transmitted is invalid
or malformed in some way.
However the Web still works.
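As a small illustration of what "keeping the state in the URL" means, here is a hedged Python sketch. The search endpoint, the `q` and `page` parameter names, and both function names are made up for the example. The point is only that the link the server hands back carries everything needed to fetch the next page, so no cookie or server-side session is involved.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def next_page_url(base_url, query, page):
    """Build a link to the following page with all paging state in the URL."""
    return base_url + "?" + urlencode({"q": query, "page": page + 1})


def read_state(url):
    """Recover the paging state from a URL alone (hypothetical parameters)."""
    params = parse_qs(urlsplit(url).query)
    return params["q"][0], int(params["page"][0])
```

A client that follows the link returned by `next_page_url` never needs to remember which page it was on; `read_state` shows that the server can reconstruct that from the request URL by itself.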
REST is an attempt to formalize the workings of the Web ex post facto.
However it describes an ideal of how the Web works and in many cases
the reality of the Web deviates significantly from what advocates of
RESTful approaches preach. The question is whether this disconnect
invalidates the teachings of REST. I think the answer is no.
In almost every case I've described above, the behavior of client
applications and the user experience would be improved if HTTP [and
XML] were used correctly. This isn't supposition. As the
developer of an RSS reader, I know that
my life and that of my users would be better if servers emitted the
correct MIME types for their feeds, the feeds were always at least
well-formed and feeds always pointed to related metadata/content such
as comment feeds (i.e. hypermedia is the engine of application state).
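The first two of those wishes are easy to check mechanically. Below is a minimal Python sketch of the kind of check a feed client could run on a response; it is not code from any particular RSS reader. The function name and the sets of media types are my own assumptions: application/atom+xml is the registered type for Atom, while application/rss+xml is a widely used convention for RSS.

```python
import xml.etree.ElementTree as ET

# Media types treated as "correct" for feeds in this sketch (an assumption).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
# Generic XML types: usable, but they do not say the document is a feed.
GENERIC_XML_TYPES = {"application/xml", "text/xml"}


def check_feed(content_type, body):
    """Return a list of problems found in a feed response (empty if none)."""
    problems = []

    # Drop parameters such as charset before comparing the media type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in GENERIC_XML_TYPES:
        problems.append("generic XML media type: " + media_type)
    elif media_type not in FEED_TYPES:
        problems.append("unexpected media type: " + (media_type or "(none)"))

    # Well-formedness only; this does not validate against RSS or Atom.
    try:
        ET.fromstring(body)
    except ET.ParseError as exc:
        problems.append("not well-formed XML: " + str(exc))

    return problems
```

A feed served as text/html with a mismatched tag would come back with two problems, whereas a well-formed feed served as application/rss+xml comes back clean. Checking that a feed links to related resources such as comment feeds would need format-specific logic that this sketch leaves out.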
Let's get back to the notion of the Two Webs. Right now, there is the
primarily HTML-powered Web whose primary clients are Web browsers
and search engine bots. For better or worse, over time Web browsers
have had to deal with the fact that Web servers and Web masters ignore
several rules of the Web from using incorrect MIME types for files to
having malformed/invalid documents. This has cemented hacks and bad
practices as the status quo on the HTML web. It is unlikely this is
going to change anytime soon, if ever.
Where things get interesting is that we are now using the Web for more
than serving Web documents for Web browsers. The primary clients for
these documents aren't Web browsers written by Microsoft and Netscape/AOL/Mozilla and bots from a handful of search engines. For example, with RSS/Atom we have hundreds of clients
with more to come as the technology becomes more mainstream. Also, as Web
APIs become more popular, more and more Web sites are exposing
services to the world on the Web using RESTful approaches. In all of
these examples, there is justification in being more rigorous in the
way one uses HTTP than one would be when serving HTML documents for
one's web site.
In conclusion, I completely agree with Robert Sayre's statement that "there are really two kinds of HTTP. One is HTTP-For-Browsers, and
one is HTTP-For-APIs."
When talking about REST and HTTP-For-APIs, we should be careful not
to learn the wrong lessons from how HTTP-For-Browsers is used today.