To follow up my post asking Is HTTP Content Negotiation Broken as Designed?, I found a post by Ian Hickson on a related topic. In his post entitled Content-Type is dead, he writes

Browsers and other user agents largely ignore the HTTP Content-Type header, relying on undefined sniffing heuristics to determine what the content of a page really is.

  • RSS feeds are always sniffed, regardless of their MIME type, because, to quote a Safari engineer, "none of them have the right mime type".
  • The target of img elements is almost always assumed to be an image, regardless of the declared type.
  • IE in particular is well known for ignoring the Content-Type header, despite this having been the source of security bugs in the past.
  • Browsers have been forced to implement heuristics to handle text/plain files as binary because video files are widely served with the wrong MIME types.

Unfortunately, we're now at a stage where browsers are continuously having to reverse-engineer each other to determine why they are handling content differently. A browser can't afford to render any less content than a browser with more market share, because otherwise users won't switch, and the new browser will not be adopted.

I think it may be time to retire the Content-Type header, putting to sleep the myth that it is in any way authoritative, and instead have well-defined content-sniffing rules for Web content.

Ian is someone who's definitely been around the block when it comes to HTTP, given that he's been involved in Web standards groups for several years and used to work on the Opera Web Browser. On the other side of the argument is Joe Gregorio, whose post Content-Type is dead, for a short period of time, for new media-types, film at 11 does an excellent job of the kind of dogmatic arguing from theory that I criticized in my previous post. In this case, Joe leans on the W3C Technical Architecture Group's (TAG) finding on Authoritative Metadata as the basis of his argument.

MIME types and HTTP content negotiation are good ideas in theory that have failed to take hold on the Web in practice. Arguing that this fact contravenes stuff written in specs from the last decade, or findings from some ivory tower group of folks at the W3C, seems like religious dogmatism and not fodder for decent technical debate.

That said, I don't think MIME types should be retired. However, I do think some Web/REST advocates need to look around and realize what's happening on the Web instead of arguing from an "ideal" or "theoretical" perspective.
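To make the sniffing concrete, here's a toy sketch of the kind of heuristic Ian describes: detecting a feed while ignoring the declared Content-Type entirely. This is purely my own illustration; the rules real browsers use are undocumented and certainly more involved.

using System;
using System.IO;
using System.Net;

class FeedSniffer
{
    // Guess whether a URL serves a feed by peeking at the content itself,
    // ignoring the Content-Type header the server sent.
    static bool LooksLikeFeed(string url)
    {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            // Real sniffers only examine the first few hundred bytes.
            char[] buffer = new char[512];
            int read = reader.Read(buffer, 0, buffer.Length);
            string head = new string(buffer, 0, read);

            return head.Contains("<rss")        // RSS 0.9x/2.0
                || head.Contains("<rdf:RDF")    // RSS 1.0
                || head.Contains("<feed");      // Atom
        }
    }

    static void Main()
    {
        // Hypothetical URL for a feed served with the wrong MIME type.
        Console.WriteLine(LooksLikeFeed("http://example.org/blog/rss.xml"));
    }
}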


 

Categories: Web Development

While you were sleeping, Windows Live Academic Search was launched at http://academic.live.com. From the Web site we learn

Welcome to Windows Live Academic

Windows Live Academic is now in beta. We currently index content related to computer science, physics, electrical engineering, and related subject areas.

Academic search enables you to search for peer reviewed journal articles contained in journal publisher portals and on the web in locations like citeseer.

Academic search works with libraries and institutions to search and provide access to subscription content for their members. Access restricted resources include subscription services or premium peer-reviewed journals. You may be able to access restricted content through your library or institution.

We have built several features designed to help you rapidly find the content you are searching for including abstract previews via our preview pane, sort and group by capability, and citation export. We invite you to try us out - and share your feedback with us.

I tried comparing a search for my name on Windows Live Academic Search and Google Scholar.

  1. Search for "Dare Obasanjo" on Windows Live Academic Search

  2. Search for "Dare Obasanjo" on Google Scholar

Google Scholar finds almost 20 citations while Windows Live Academic Search finds only one. Google Scholar seems to use sources beyond academic papers, such as articles written for technology sites like XML.com. I like the user interface of Windows Live Academic Search, but we need to expand the data sources we query before I'd use it regularly.


 

Categories: Windows Live

Working on RSS Bandit is my hobby and sometimes I retreat to it when I need to unwind from the details of work or just need a distraction. This morning was one such moment. I decided to look into the issue raised in the thread from our forums entitled MSN Spaces RSS Feeds Issues - More Info, where some of our users complained about a cookie parsing error when subscribed to feeds from MSN Spaces.

Before I explain what the problem is, I'd like to show an example of what an HTTP cookie header looks like, taken from the Wikipedia entry for HTTP cookie:

Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.usatoday.com

Note the use of a semicolon as a delimiter for separating cookies. So it turned out that the error was in the following line of code:


if (cookieHeaders.Length > 0) {
    // convert the semicolon delimiters into the commas that
    // CookieContainer.SetCookies expects (this is the buggy line)
    container.SetCookies(url, cookieHeaders.Replace(";", ","));
}

You'll note that we replace the semicolon delimiters with commas. Why would we do such a strange thing when the example above shows that cookies can contain commas? It's because the CookieContainer.SetCookies method in the .NET Framework requires the delimiters to be commas. WTF?

This seems so fundamentally broken I feel that I must be mistaken. I've tried searching for possible solutions to the problem online but I couldn't find anyone else who has had this problem. Am I using the API incorrectly? Am I supposed to parse the cookie by hand before feeding it to the method? If so, why would anyone design the API in such a brain damaged manner?

*sigh*

I was having more fun drafting my specs for work.

Update: Mike Dimmick has pointed out in a comment below that my understanding of cookie syntax is incorrect. The cookie shown in the Wikipedia example is one cookie, not four as I thought. It looks like simply grabbing sample code from blogs may not have been a good idea. :) This means that I may have been getting malformed cookies when fetching the MSN Spaces RSS feeds after all. Now if only I could repro the problem...
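To make Mike's correction concrete, here's a minimal sketch, my own illustration rather than RSS Bandit's actual code, that feeds the Wikipedia example through CookieContainer.SetCookies. Per the method's documentation, commas delimit separate cookies while semicolons delimit the attributes within a single cookie, so one cookie can be passed through untouched; it's the blanket Replace(";", ",") that chops the expires date at its embedded comma. (I've pushed the expiry into the future so the container doesn't discard the cookie as already expired.)

using System;
using System.Net;

class CookieDelimiterDemo
{
    static void Main()
    {
        Uri url = new Uri("http://www.usatoday.com/");

        // The Wikipedia example cookie, with the expiry moved to the future.
        string header = "RMID=732423sdfs73242; expires=Tue, 31-Dec-2030 23:59:59 GMT; "
                      + "path=/; domain=.usatoday.com";

        CookieContainer container = new CookieContainer();
        try
        {
            // Pass the single cookie through untouched; SetCookies parses the
            // semicolon-delimited attributes itself. Replacing ";" with ","
            // would instead make it see four bogus cookies and split the
            // expires date in half at "Tue,".
            container.SetCookies(url, header);
            foreach (Cookie cookie in container.GetCookies(url))
                Console.WriteLine("{0}={1} (expires {2})",
                                  cookie.Name, cookie.Value, cookie.Expires);
        }
        catch (CookieException e)
        {
            Console.WriteLine("Cookie parse failed: " + e.Message);
        }
    }
}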


 

Categories: RSS Bandit | Web Development

In a recent mail on the ietf-types mailing list Larry Masinter (one of the authors of the HTTP 1.1 specification) had the following to say about content negotiation in HTTP

GET /models/SomeModel.xml HTTP/1.1
Host: www.example.org
Accept: application/cellml-1.0+xml; q=0.5, application/cellml-1.1+xml; q=1

HTTP content negotiation was one of those "nice in theory" protocol additions that, in practice, didn't work out. The original theory of content negotiation was worked out when the idea of the web was that browsers would support a handful of media types (text, html, a couple of image types), and so it might be reasonable to send an 'accept:' header listing all of the types supported. But in practice as the web evolved, browsers would support hundreds of types of all varieties, and even automatically locate readers for content-types, so it wasn't practical to send an 'accept:' header for all of the types.

So content negotiation in practice doesn't use accept: headers except in limited circumstances; for the most part, the sites send some kind of 'active content' or content that autoselects for itself what else to download; e.g., a HTML page which contains Javascript code to detect the client's capabilities and figure out which other URLs to load. The most common kind of content negotiation uses the 'user agent' identification header, or some other 'x-...' extension headers to detect browser versions, among other things, to identify buggy implementations or proprietary extensions.

I think we should deprecate HTTP content negotiation, if only to make it clear to people reading the spec that it doesn't really work that way in practice.

HTTP content negotiation has always struck me as something that sounds like a good idea in theory but didn't work out in practice. It's good to see one of the founding fathers of HTTP actually admit that it is an example of theory not matching reality. It's always good to remember that just because something is written in a specification from some standards body doesn't make it holy writ. I've seen people debate online who throw out quotes from Roy Fielding's dissertation and IETF RFCs as if they are evangelical preachers quoting chapter and verse from the Holy Bible.

Some of the things you find in specifications from the W3C and IETF are good ideas. However, they are just that: ideas. Sometimes technological advances make these ideas outdated, and sometimes the spec authors simply failed to consider other perspectives on the problem at hand. Expecting a modern browser to send, on every single GET request, an itemized list of every file type that can be read by the applications on your operating system, plus the priority in which those file types are preferred, is simply not feasible or particularly useful in practice. It may have been feasible a long time ago, but it isn't now.
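For concreteness, here's a bare-bones sketch, my own illustration rather than anything from the spec or a shipping server, of the per-request matching RFC 2616 expects a server to perform against those q-values, using the CellML request quoted above (wildcards like */* are omitted for brevity):

using System;
using System.Collections.Generic;

class AcceptHeaderDemo
{
    // Pick the server-supported media type the client weights highest.
    static string Negotiate(string acceptHeader, string[] supported)
    {
        Dictionary<string, double> prefs = new Dictionary<string, double>();
        foreach (string entry in acceptHeader.Split(','))
        {
            string[] pieces = entry.Split(';');
            string mediaType = pieces[0].Trim();
            double q = 1.0;   // per RFC 2616, a missing q parameter means q=1
            for (int i = 1; i < pieces.Length; i++)
            {
                string param = pieces[i].Trim();
                if (param.StartsWith("q="))
                    q = double.Parse(param.Substring(2));
            }
            prefs[mediaType] = q;
        }

        string best = null;
        double bestQ = 0.0;
        foreach (string type in supported)
        {
            double q;
            if (prefs.TryGetValue(type, out q) && q > bestQ)
            {
                best = type;
                bestQ = q;
            }
        }
        return best;   // null means the server should respond 406 Not Acceptable
    }

    static void Main()
    {
        string accept = "application/cellml-1.0+xml; q=0.5, application/cellml-1.1+xml; q=1";
        string[] supported = { "application/cellml-1.0+xml" };
        Console.WriteLine(Negotiate(accept, supported) ?? "406 Not Acceptable");
    }
}

The matching itself is trivial; the infeasible part is on the client, where an honest Accept header would have to itemize hundreds of media types on every request.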

Similar outdated and infeasible ideas litter practically every W3C and IETF specification out there. Remember that the next time you quote chapter and verse from some Ph.D. dissertation or IETF/W3C specification to justify a technology decision. Supporting standards is important, but applying critical thinking to the problem at hand is more important.

Thanks to Mark Baker for the link to Larry Masinter's post.


 

Categories: Web Development

I just noticed that last week the W3C published a working draft specification for The XMLHttpRequest Object. I found the end of the working draft somewhat interesting. Read through the list of references and authors of the specification below.

References

This section is normative

DOM3
Document Object Model (DOM) Level 3 Core Specification, Arnaud Le Hors (IBM), Philippe Le Hégaret (W3C), Lauren Wood (SoftQuad, Inc.), Gavin Nicol (Inso EPS), Jonathan Robie (Texcel Research and Software AG), Mike Champion (Arbortext and Software AG), and Steve Byrne (JavaSoft).
RFC2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner.
RFC2616
Hypertext Transfer Protocol -- HTTP/1.1, R. Fielding (UC Irvine), J. Gettys (Compaq/W3C), J. Mogul (Compaq), H. Frystyk (W3C/MIT), L. Masinter (Xerox), P. Leach (Microsoft), and T. Berners-Lee (W3C/MIT).

B. Authors

This section is informative

The authors of this document are the members of the W3C Web APIs Working Group.

  • Robin Berjon, Expway (Working Group Chair)
  • Ian Davis, Talis Information Limited
  • Gorm Haug Eriksen, Opera Software
  • Marc Hadley, Sun Microsystems
  • Scott Hayman, Research In Motion
  • Ian Hickson, Google
  • Björn Höhrmann, Invited Expert
  • Dean Jackson, W3C
  • Christophe Jolif, ILOG
  • Luca Mascaro, HTML Writers Guild
  • Charles McCathieNevile, Opera Software
  • T.V. Raman, Google
  • Arun Ranganathan, AOL
  • John Robinson, AOL
  • Doug Schepers, Vectoreal
  • Michael Shenfield, Research In Motion
  • Jonas Sicking, Mozilla Foundation
  • Stéphane Sire, IntuiLab
  • Maciej Stachowiak, Apple Computer
  • Anne van Kesteren, Opera Software

Thanks to all those who have helped to improve this specification by sending suggestions and corrections. (Please, keep bugging us with your issues!)

Interesting. A W3C specification that documents a proprietary Microsoft API, yet not only doesn't include a Microsoft employee as a spec author but doesn't even reference any of the IXMLHTTPRequest documentation on MSDN.

I'm sure there's a lesson in there somewhere. ;)


 

Categories: Web Development | XML

From the inaugural post from the Windows Live ID team's blog entitled The beginning of Windows Live ID we learn

Welcome to the Windows Live ID team blog!  This is our inaugural “Hello World!” post to introduce Windows Live ID.
 
Windows Live ID is the upgrade/replacement for the Microsoft Passport service and is the identity and authentication gateway service for cross-device access to Microsoft online services, such as Windows Live, MSN, Office Live and Xbox Live.  Is this the authentication service for the world?  No :-)  It's primarily designed for use with Microsoft online services and by Microsoft-affiliated close partners who integrate with Windows Live services to offer combined innovations to our mutual customers.  We will continue to support the Passport user base of 300+ Million accounts and seamlessly upgrade these accounts to Windows Live IDs.  Partners who have already implemented Passport are already compatible with Windows Live ID.
 
Windows Live ID is being designed to be an identity provider among many within the Identity Metasystem.  In the future, we will support Federated identity scenarios via WS-* and support InfoCards.  For developers we will be providing rich programmable interfaces via server and client SDKs to give third party application developers access to authenticated Microsoft Live services and APIs.
 
Over the next few weeks as we complete our deployment, you will see the Windows Live ID service come alive through our respective partners sites and services. 

I had a meeting with Trevin from the Passport (now Windows Live ID) team to talk about their plans for providing server-based and client SDKs to give application developers the ability to access Windows Live services and APIs. I've been nagging him for a while with a lengthy list of requirements, and it looks like they'll be delivering APIs that enable very interesting uses of Windows Live quite soon.

This is shaping up to be a good year.


 

Categories: Windows Live

Niall Kennedy has a blog post entitled Creating a feed syndication platform at Microsoft where he writes

Starting next week I will join Microsoft's Windows Live division to create a new product team around syndication technologies such as RSS and Atom. I will help build a feed syndication platform leveraged by Microsoft products and developers all around the world. I am excited to construct a team and product from scratch focused on scalability and connecting syndication clients and their users wherever they may exist: desktop, mobile, media center, gaming console, widget, gadget, and more.

Live.com is the new default home page for users of the Internet Explorer 7 and the Windows Vista operating system. Live.com will be the first feed syndication experience for hundreds of millions of users who would love to add more content to their page, connect with friends, and take control of the flow of information in ways geeks have for years. I do not believe we have even begun to tap into the power of feeds as a platform and the possibilities that exist if we mine this data, connect users, and add new layers of personalization and social sharing. These are just some of the reasons I am excited to build something new and continue to change how the world can access new information as it happens

I spoke to Niall on the phone last week and I'm glad to see that he accepted our offer. When I was first hired to work in MSN (now Windows Live) I was told I'd be working on three things: a blogging platform for MSN Spaces, a brand new social networking platform, and an RSS platform. I've done the first two and was looking forward to working on the third, but something has come up which will consume my attention for the near future. I promised my management and the partner teams who were interested in this platform that I'd make sure we got the right person to work on this project. When I found out Niall was leaving Technorati it seemed like a match made in heaven. I recommended him for the job and talked to him on the phone about working at Microsoft. The people who will be working with him thought he was great, and the rest is history.

One of the questions Niall asked me last week was why I work at Microsoft given that I've written blog posts critical of the company. The answer came easily: I told him that Microsoft is the one place I know I can build the kind of software and end-to-end experience I'd like. Nowhere else is there the same breadth of software applications that can be brought together to give end users a unified experience. Where else can a punk like me build a social networking platform that is utilized not only by the most popular blogging platform in China but also by the world's most popular instant messaging application? And that's just the beginning. There is a lot of opportunity to build really impactful software at Windows Live. When I'm critical of Microsoft it's because I want us to be a better company for people like me, not because I don't like it here. Unfortunately, lots of people can't tell the difference. ;)

By the way, we are hiring. If you are interested in developer, test or program management positions building the biggest social computing platform on the planet then send your resume to dareo@msft.com (swap msft.com with microsoft.com).


 

Categories: Windows Live

Jeff Schneider has a blog post entitled You're so Enterprise... which is meant to be a response to a post I wrote entitled My Website is Bigger Than Your Enterprise. Since he neither linked to my post nor mentioned my full name, it's actually a coincidence that I found his post at all. Anyway, he writes

In regard to the comment that Dare had made, "If you are building distributed applications for your business, you really need to ask yourself what is so complex about the problems that you have to solve that makes it require more complex solutions than those that are working on a global scale on the World Wide Web today." I tried to have a conversation with several architects on this subject and we immediately ran into a problem. We were trying to compare and contrast a typical enterprise application with one like Microsoft Live. Not knowing the MS Live architecture we attempted to 'best guess' what it might look like:

  • An advanced presentation layer, probably with an advanced portal mechanism
  • Some kind of mechanism to facilitate internationalization
  • A highly scalable 'logic layer'
  • A responsive data store (cached, but probably not transactional)
  • A traditional row of web servers / maybe Akamai thing thrown in
  • Some sort of user authentication / access control mechanism
  • A load balancing mechanism
  • Some kind of federated token mechanism to other MS properties
  • An outward facing API
  • Some information was syndicated via RSS
  • The bulk of the code was done in some OO language like Java or C#
  • Modularity and encapsulation were encouraged; loose coupling when appropriate
  • Some kind of systems management and monitoring
  • Assuming that we are capturing any sensitive information, an on the wire encryption mechanism
  • We guessed that many of the technologies that the team used were dictated to them: Let's just say they didn't use Java and BEA AquaLogic.
  • We also guessed that some of the typical stuff didn't make their requirements list (regulatory & compliance issues, interfacing with CICS, TPF, etc., interfacing with batch systems, interfacing with CORBA or DCE, hot swapping business rules, guaranteed SLA's, ability to monitor state of a business process, etc.)
At the end of the day - we were scratching our heads. We DON'T know the MS Live architecture - but we've got a pretty good guess on what it looks like - and ya know what? According to our mocked up version, it looked like all of our 'Enterprise Crap'.

So, in response to Dare's question of what is so much more complex about 'enterprise' over 'web', our response was "not much, the usual compliance and legacy stuff". However, we now pose a new question to Dare:
What is so much more simple about your architecture than ours?

Actually, a lot of the stuff he talks about with regard to SLAs, monitoring business processes and regulatory issues are things we face as part of building Windows Live. However, it seems Jeff missed my point. The point is that folks building systems at places like Yahoo, Amazon and Windows Live are building systems that have to solve problems that are at a minimum just as complex as those of your average medium-sized to large business. From his post, Jeff seems to agree with this core assertion. Yet people at these companies are embracing approaches such as RESTful web services and scripting languages, both of which are often dissed as not being 'enterprise' by enterprise architects.

Just because a problem seems complex doesn't mean it needs a complex technology to solve it. For example, at its core RSS solves the same problem as WS-Eventing. I can describe all sorts of scenarios where RSS falls down and WS-Eventing does not. However, RSS is good enough for a large number of scenarios at a smidgeon of the complexity cost of WS-Eventing, as the sketch below illustrates. Then there are technologies like WS-ReliableMessaging that add complexity to the mix but often don't solve the real problems facing large scale services today. See my post More on Pragmatism and Web Services for my issues with WS-ReliableMessaging.
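As a concrete illustration of that gap, here's a minimal sketch, entirely my own and using a hypothetical feed URL, of RSS as poor man's eventing: a dumb polling loop that notices new items. WS-Eventing layers subscription management, filters and SOAP plumbing on top of this; for a large class of scenarios, the loop below is all you need.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Xml;

class FeedPoller
{
    static void Main()
    {
        string feedUrl = "http://example.org/blog/rss.xml";   // hypothetical
        Dictionary<string, bool> seen = new Dictionary<string, bool>();

        while (true)
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(feedUrl);   // a plain HTTP GET; conditional GET via ETags is an easy upgrade

            foreach (XmlNode item in doc.SelectNodes("/rss/channel/item"))
            {
                XmlNode guid = item.SelectSingleNode("guid");
                XmlNode title = item.SelectSingleNode("title");
                string id = (guid != null) ? guid.InnerText
                          : (title != null) ? title.InnerText : item.InnerXml;

                if (!seen.ContainsKey(id))
                {
                    // This is the "event": an item we haven't seen before.
                    seen[id] = true;
                    Console.WriteLine("New item: " + ((title != null) ? title.InnerText : id));
                }
            }
            Thread.Sleep(TimeSpan.FromMinutes(15));   // typical aggregator polling interval
        }
    }
}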

My point remains the same. Complex problems do not necessarily translate to requiring complex solutions.

Question everything.


 

Categories: Web Development

A few weeks ago I blogged about the current beta of Social Networking in MSN Spaces for our Australian users. What I didn't mention is that just as most features in MSN Spaces are integrated with Windows Live Messenger, so is the Friends List feature. Australian users of Windows Live Messenger will have three integration points for interacting with the Friends List. First, one can right-click on a Messenger contact and select "View->Friends List" to browse that contact's Friends List. Second, one can respond to pending Friends List requests directly from the Messenger client (as is also the case with other features like Live Contacts). Finally, one can browse the Friends List from a contact's Contact Card. Below is a screenshot of what happens when an Australian user right-clicks on one of their Messenger contacts and selects "View->Friends List". I can't wait till we finally ship this feature to all our users.

NOTE: There is a known bug that stops the Friends list from showing up if you are using the Internet Explorer 7 beta.


 

Categories: Windows Live


Via Shelley Powers I found out that Mark Pilgrim has restarted his blog with a new post entitled After the Bath. Ironically, I didn't find this out from my favorite RSS reader, because it correctly supports the HTTP 410 (Gone) status code, which Mark's feed had been returning for over a year.

Mark Pilgrim's feed being resurrected from the dead is another example of how simply implementing support for Web specifications as written sometimes bites you on the butt. :)
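For the curious, here's roughly what that spec-compliant handling looks like, a sketch under my own assumptions rather than RSS Bandit's or any other reader's actual code:

using System;
using System.Net;

class FeedFetcher
{
    static void Fetch(string feedUrl)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest) WebRequest.Create(feedUrl);
            using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())
            {
                // ...parse the feed as usual...
            }
        }
        catch (WebException ex)
        {
            HttpWebResponse response = ex.Response as HttpWebResponse;
            if (response != null && response.StatusCode == HttpStatusCode.Gone)
            {
                // RFC 2616 says 410 means the resource is gone *permanently*,
                // so a well-behaved reader unsubscribes and never polls again,
                // which is exactly why a resurrected feed stays invisible.
                Console.WriteLine("Feed {0} returned 410 (Gone); unsubscribing.", feedUrl);
            }
        }
    }

    static void Main()
    {
        Fetch("http://example.org/feed.xml");   // hypothetical feed URL
    }
}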