Dare Obasanjo's weblog

Spaces & SkyDrive: Recent Releases from Windows Live

Over the past week, two Windows Live teams have shipped some good news to their users. The Windows Live SkyDrive team addressed the two most often raised issues with their service with the announcements in their post Welcome to the bigger, better, faster SkyDrive! which reads

You've made two things clear since our first release: You want more space; and you want SkyDrive where you are. Today we're giving you both. You now have five times the space you had before — that’s 5GB of free online storage for your favorite documents, pictures, and other files.

SkyDrive is also available now in 38 countries/regions. In addition to Great Britain, India, and the U.S., we’re live in Argentina, Australia, Austria, Belgium, Bolivia, Brazil, Canada, Chile, Colombia, Denmark, the Dominican Republic, Ecuador, El Salvador, Finland, France, Guatemala, Honduras, Italy, Japan, Mexico, the Netherlands, New Zealand, Nicaragua, Norway, Panama, Paraguay, Peru, Puerto Rico, Portugal, South Korea, Spain, Sweden, Switzerland, Taiwan, and Turkey.

Wow, Windows Live is just drowning our customers in free storage. That's 5GB in SkyDrive and 5GB for Hotmail.

The Windows Live Spaces team also shipped some sweetness to their customers. This feature is a little nearer to my heart since it relies on Contact platform APIs I worked on a little while ago. The feature is described by Michelle on their team blog in a post entitled More information on Friends in common which states

In the friends module on another person’s space, there is a new area that highlights friends you have in common. Right away you can see the number of people you both know and the profile pictures of some of those friends.
Want to see the rest of your mutual friends? Click on In common and you’re taken to a full page view that shows all of your friends as well as separate lists of friends in common and friends that you don't have in common. This way you can also discover new people that you might know in real life, but are not connected with on Windows Live.

Finding friends in common is also especially important when planning an event on Windows Live Events. Who wants to go to a party when none of your friends are going?
On the Guest list area of every event, you can now quickly see how many of your friends have also been invited to the event. Just click on See who’s going and see whether or not your friends are planning to go.

Showing mutual friends as described above is one of those small features that make a big impact on the user experience. Nice work, Michelle and Shu, on getting this out the door.

Now playing: Iconz - I Represent

Categories: Windows Live

February 28, 2008

@ 04:31 PM

No Contest: FriendFeed vs. The Facebook News Feed

I found Charles Hudson’s post FriendFeed and the Facebook News Feed - FriendFeed is For Sharing and Facebook Used to be About my Friends somewhat interesting since one of the things I’ve worked on recently is the What’s New page on Windows Live Spaces. He writes

I was reading this article on TechCrunch “Facebook Targets FriendFeed; Opening Up The News Feed” and I found it kind of interesting. As someone who uses FriendFeed a lot and uses Facebook less and less, I don’t think the FriendFeed team should spend much time worrying about this announcement. The reason is really simple.

In the beginning, the Facebook News Feed was really interesting. It was all information about my friend and what they were doing. Over time, it’s become a lot less interesting.
…
I would like to see Facebook separate “news” from “activity” - “news” is stuff that happened to people (person x became friend with person y, person x is no longer in a relationship, status updates, etc) and “activities” are stuff related to applications, content sharing, etc. Trying to stuff news and activity into the same channel results in a lot of chaos and noise.

FriendFeed is really different. To me, FriendFeed is a community of people who like to share stuff. That’s a very different product proposition than what the News Feed originally set out to do.

This is an example of a situation where I agree with the sentiment in Jeff Atwood’s post I Repeat: Do Not Listen to Your Users. This isn’t to say that Charles Hudson’s increasingly negative user experience with Facebook should be discounted or that the things he finds interesting about FriendFeed are invalid. The point is that in typical end user fashion, Charles’s complaints contradict themselves and his suggestions wouldn’t address the actual problems he seems to be having.

The main problem Charles has with the news feed on Facebook is its increased irrelevance due to massive amounts of application spam. This has nothing to do with FriendFeed being more of a community site than Facebook. This also has nothing to do with separating “news” from “activity” (whatever that means). Instead it has everything to do with the fact that the Facebook platform is an attractive target for applications attempting to “grow virally” by sending all sorts of useless crap to people’s friends. FriendFeed doesn’t have that problem because everything that shows up in your feed is pulled from a carefully selected list of services shown below

The 28 services supported by FriendFeed

The thing about the way FriendFeed works is that there is little chance that stuff in the feed would be considered spammy because the content in the feed will always correspond to a somewhat relevant user action (Digging a story, adding a movie to a Netflix queue, uploading photos to Flickr, etc).

So this means one way Facebook can add relevance to the content in their feed is to pull data in from more valid sources instead of relying on spammy applications pushing useless crap like “Dare’s level 17 zombie just bit Rob’s level 12 vampire”.

That’s interesting but there is more. There doesn’t seem to be any tangible barrier to entry in the “market” that FriendFeed is targeting since all they seem to be doing is pulling the public RSS feeds from a handful of Web sites. This is the kind of project I could knock out in two months. The hard part is having a scalable RSS processing platform. However we know Facebook already has one for their feature which allows one to import blog posts as Notes. So that makes it the kind of feature an enterprising dev at Facebook could knock out in a week or two.

The only thing FriendFeed may have going for it is the community that ends up adopting it. The tricky thing about social software is that your users are as pivotal to your success as your features. Become popular with the right kind of users and your site blows up (e.g. MySpace) while with a different set of users your site eventually stagnates due to its niche nature (e.g. LiveJournal).

FriendFeed reminds me of Odeo, a project by some formerly successful entrepreneurs that tries to jump on a hyped bandwagon without actually scratching an itch that the founders have or fully understanding the space.
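To make the "pulling the public RSS feeds" point concrete, here is a minimal sketch of that kind of aggregation in Python. The two inline feeds and the service names are made-up stand-ins for real public feeds, and nothing here reflects how FriendFeed is actually built. It only shows that merging items from several RSS 2.0 documents into one timeline is a small amount of code, which is why the scalable processing platform is the hard part.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds standing in for two services' public RSS output.
FLICKR_RSS = """<rss version="2.0"><channel><title>Flickr</title>
<item><title>Uploaded a photo</title><pubDate>Wed, 27 Feb 2008 10:00:00 GMT</pubDate></item>
</channel></rss>"""

DIGG_RSS = """<rss version="2.0"><channel><title>Digg</title>
<item><title>Dugg a story</title><pubDate>Wed, 27 Feb 2008 12:30:00 GMT</pubDate></item>
<item><title>Dugg another story</title><pubDate>Tue, 26 Feb 2008 09:15:00 GMT</pubDate></item>
</channel></rss>"""


def parse_items(service, rss_text):
    """Return (timestamp, service, title) tuples for each item in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        when = parsedate_to_datetime(item.findtext("pubDate"))
        items.append((when, service, item.findtext("title")))
    return items


def merge_feeds(feeds):
    """Merge items from several services into one list, newest first."""
    merged = []
    for service, rss_text in feeds.items():
        merged.extend(parse_items(service, rss_text))
    return sorted(merged, key=lambda entry: entry[0], reverse=True)


timeline = merge_feeds({"flickr": FLICKR_RSS, "digg": DIGG_RSS})
for when, service, title in timeline:
    print(when.isoformat(), service, title)
```

A real aggregator would also need polling schedules, conditional GETs, and duplicate detection, none of which is attempted here.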

Now playing: Jae Millz - No, No, No (remix) (feat. Camron & T.I.)

Categories: Social Software

February 27, 2008

@ 03:51 PM

RSS Bandit: Progress Report on Integrating with the Windows RSS Platform

Race to the Bottom [pic]

Now playing: Supremes - Where Did Our Love Go?

Categories: Mindless Link Propagation

February 24, 2008

@ 08:03 PM

Comments [2]

I'm slowly working towards the goal of making RSS Bandit a desktop RSS client for Google Reader, NewsGator Online and Exchange (via the Windows RSS platform). Today I made some progress integrating with the Windows RSS platform but as with any integration story it is some good news and some bad news. The good news can be seen in the screen shot below

RSS Bandit and Internet Explorer sharing the same feed list

The good news is that for the most part the core application has been refactored to be able to transparently support loading feeds from sources such as the Windows RSS platform or from the RSS Bandit feed cache. It should take one or two more weekends and I can move on to adding similar support for synchronizing feeds from Google Reader.

The bad news is that using the Windows RSS platform has been a painful exercise. My current problem is that for some reason I can't fathom I can't receive events from the Windows RSS platform. I can write the same code and receive events from a standalone program but for some reason the event handlers aren't triggered when the exact same code is running in RSS Bandit. The main problem I had turned out to have been due to a stupid oversight. With that figured out we're about 80% done with integration with the Windows RSS platform. There are lots of smaller issues too, such as the fact that there is no event that indicates an enclosure has finished being downloaded, although the documentation seems to imply that the FeedDownloadCompleted event does double duty. Or the various exceptions that can occur when accessing properties of a feed, including a BadImageFormatException for accessing IFeed.Title if the underlying feed file has been corrupted somehow or a COMException complaining "Element not found" if you access IFeed.DownloadUrl before you've attempted to download the feed.

I've used up my budget of free time for coding this weekend so I'll start up again next weekend. In the meantime, if you have any tips on working with the Windows RSS platform from C#, don't hesitate to share.

Now Playing: Bone Thugs 'N Harmony - No Surrender

Categories: RSS Bandit

February 23, 2008

@ 04:00 AM

Slashdotters on Google's Foray Into Health Services [pic]

Now playing: DRS - Sickness

Categories: Competitors/Web Companies

February 23, 2008

@ 04:00 AM

More Thoughts on an HTTP PATCH and AtomPub

Sam Ruby has an insightful response to Joe Gregorio in his post APP Level Patch where he writes

Joe Gregorio: At Google we are considering using PATCH. One of the big open questions surrounding that decision is XML patch formats. What have you found for patch formats and associated libraries?

I believe that looking for an XML patch format is looking for a solution at the wrong meta level. Two examples, using AtomPub:

In Atom, the order of elements in an entry is not significant. AtomPub servers often do not store their data in XML serialized form, or even in DOM form. If you PUT an entry, and then send a PATCH based on the original serialization, it may not be understood.
A lot of data in this world is either not in XML, or if it is in XML, is simply there via tunneling. Atom elements are often merely thin wrappers around HTML. HTML has a DOM, and can flattened into a sequence of SAX like events, just like XML can be.

I totally agree with Sam. A generic “XML patch format” is totally the wrong solution. At Microsoft we had several different XML patch formats produced by the same organization because each targeted a different scenario

Diffgram: Represent a relational database table and changes to it as XML.
UpdateGram: Represent changes to an XML view of one or more relational database tables optionally including a mapping from relational <-> XML data
Patchgram: Represent infoset level differences between two XML documents

Of course, these are one line summaries but you get the point. Depending on your constraints, you’ll end up with a different set of requirements. Quick test, tell me why one would choose Patchgrams over XUpdate and vice versa?

Given the broad set of constraints that will exist in different server implementations of the Atom Publishing Protocol, a generic XML patch format will have lots of features which just don’t make sense (e.g. XUpdate can create processing instructions, Patchgrams use document ordered positions of nodes for matching).
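As a rough illustration of why position-based matching is a poor fit for Atom, the sketch below applies the same "replace the first child" patch to two serializations of one entry that differ only in element order. The patch functions are hypothetical helpers written for this example. They are not Patchgram or XUpdate implementations, just the simplest possible stand-ins for position-based versus name-based matching.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# The entry as the client originally sent it.
CLIENT_VIEW = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Atom-Powered Robots Run Amok</title>
  <content>Some text.</content>
</entry>"""

# The same entry as a server might re-serialize it from its own storage.
# Element order is not significant in Atom, so this is equally valid.
SERVER_VIEW = """<entry xmlns="http://www.w3.org/2005/Atom">
  <content>Some text.</content>
  <title>Atom-Powered Robots Run Amok</title>
</entry>"""


def patch_by_position(entry_xml, index, new_text):
    """Hypothetical positional patch: replace the text of the Nth child element."""
    entry = ET.fromstring(entry_xml)
    list(entry)[index].text = new_text
    return entry


def patch_by_name(entry_xml, local_name, new_text):
    """Hypothetical Atom-aware patch: address the element by its name instead."""
    entry = ET.fromstring(entry_xml)
    entry.find(ATOM + local_name).text = new_text
    return entry


def summarize(entry):
    """Map each child's local name to its text so results are easy to compare."""
    return {child.tag.replace(ATOM, ""): child.text for child in entry}


# The positional patch edits the title on one serialization and the content on the other.
print(summarize(patch_by_position(CLIENT_VIEW, 0, "New title")))
print(summarize(patch_by_position(SERVER_VIEW, 0, "New title")))
# The name-based patch edits the title regardless of element order.
print(summarize(patch_by_name(SERVER_VIEW, "title", "New title")))
```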

If you decide you really need a patch format for Atom documents, your best bet is working with the community to define one or more which are specific to the unique constraints of the Atom syndication format instead of hoping that there is a generic XML patch format out there you can shoehorn into a solution. In the words of Joe Gregorio’s former co-worker, “I make it fit!”.

Personally, I think you’ll still end up with so many different requirements (Atom stores backed by actual text documents will have different concerns from those backed by relational databases) and spottiness in supporting the capability that you are best off just walking away from this problem by fixing your data model. As I said before, if you have sub-resources which you think should be individually editable then give them a URI and make them resources as well complete with their own atom:entry element.
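Here is a minimal sketch of what "give them a URI" could look like, using a toy in-memory store in Python. The `ResourceStore` class, the URIs and the entry contents are all invented for illustration. The point is only that once the frequently edited piece is its own resource, a plain whole-resource PUT to that URI is already a small update and no patch format is needed.

```python
class ResourceStore:
    """Toy stand-in for an AtomPub server: one representation per URI."""

    def __init__(self):
        self._resources = {}

    def put(self, uri, representation):
        """Replace the whole representation at this URI; return the bytes sent."""
        self._resources[uri] = representation
        return len(representation.encode("utf-8"))

    def get(self, uri):
        return self._resources[uri]


# Hypothetical URIs: the status lives at its own address instead of inside the profile.
PROFILE_URI = "/users/42"
STATUS_URI = "/users/42/status"
ATOM_NS = 'xmlns="http://www.w3.org/2005/Atom"'

store = ResourceStore()
store.put(
    PROFILE_URI,
    f"<entry {ATOM_NS}><title>Example User</title>"
    f"<content>Interests, books, movies, music and other profile fields</content>"
    f'<link rel="related" href="{STATUS_URI}"/></entry>',
)
store.put(STATUS_URI, f"<entry {ATOM_NS}><content>Old status</content></entry>")

profile_before = store.get(PROFILE_URI)

# Updating the status is an ordinary PUT of a small entry; the profile is untouched.
bytes_sent = store.put(STATUS_URI, f"<entry {ATOM_NS}><content>New status</content></entry>")

print(bytes_sent, len(profile_before.encode("utf-8")))
```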

Now playing: Oomp Camp - Time To Throw A Chair

Categories: XML Web Services

February 23, 2008

@ 04:00 AM

How "View Source" Broke the Web

About five years ago, I was pretty active on the XML-DEV mailing list. One of the discussions that cropped up every couple of weeks (aka permathreads) was whether markup languages could be successful if they were not simple enough that a relatively inexperienced developer could “View Source” and figure out how to author documents in that format. HTML (and to a lesser extent RSS) are examples of the success of the “View Source” principle. Danny Ayers had a classic post on the subject titled The Legend of View ‘Source’ which is excerpted below

Q: How do people learn markup?
A: 'View Source'.

This notion is one of the big guns that gets wheeled out in many permathreads - 'binary XML', 'RDF, bad' perhaps even 'XML Schema, too complicated'. To a lot of people it's the show stopper, the argument that can never be defeated. Not being able to view source is the reason format X died; being able to view source is the reason for format Y's success.

But I'm beginning to wonder if this argument really holds water any more. Don't get me wrong, I'm sure it certainly used to be the case, that many people here got their initial momentum into XML by looking at that there text. I'm also sure that being able to view existing source can be a great aid in learning a markup language. What I'm questioning is whether the actual practice of 'View Source' really is so widespread these days, and more importantly whether it offers such benefits for it to be a major factor in language decisions. I'd be happy with the answer to : are people really using 'View Source' that much? I hear it a lot, yet see little evidence.
…
One last point, I think we should be clear about what is and what isn't 'View Source'. If I need an XSLT stylesheet the first thing I'll do is open an existing stylesheet and copy and paste half of it. Then I'll get Michael's reference off the shelf. I bet a fair few folks here have the bare-bones HTML 3.2 document etched into their lower cortex. But I'd argue that nothing is actually gained from 'View Source' in this, all it is is templating, the fact that it's a text format isn't of immediate relevance.

The mistake Danny made in his post was taking the arguments in favor of “View Source” literally. In hindsight, I think the key point of the “View Source” clan was that it is clear that there is a lot of cargo cult programming that goes on in the world of Web development. Whether it is directly via using the View Source feature of popular Web browsers or simply cutting and pasting code they find at places like QuirksMode, A List Apart and W3Schools, the fact is that lots of people building Web pages and syndication feeds are using technology and techniques they barely understand on a daily basis.

Back in the days when this debate came up, the existence of these markup cargo cults was celebrated because it meant that the ability to author content on the Web was available to the masses, which is still the case today (Yaaay, MySpace ;) ). However, there have been a number of downsides to the wide adoption of [X]HTML, CSS and other Web authoring technologies by large numbers of semi-knowledgeable developers and technologically challenged content authors.

One of these negative side effects has been discussed to death in a number of places including the article Beyond DOCTYPE: Web Standards, Forward Compatibility, and IE8 by Aaron Gustafson which is excerpted below

The DOCTYPE switch is broken

Back in 1998, Todd Fahrner came up with a toggle that would allow a browser to offer two rendering modes: one for developers wishing to follow standards, and another for everyone else. The concept was brilliantly simple. When the user agent encountered a document with a well-formed DOCTYPE declaration of a current HTML standard (i.e. HTML 2.0 wouldn’t cut it), it would assume that the author knew what she was doing and render the page in “standards” mode (laying out elements using the W3C’s box model). But when no DOCTYPE or a malformed DOCTYPE was encountered, the document would be rendered in “quirks” mode, i.e., laying out elements using the non-standard box model of IE5.x/Windows.
…
Unfortunately, two key factors, working in concert, have made the DOCTYPE unsustainable as a switch for standards mode:

egged on by A List Apart and The Web Standards Project, well-intentioned developers of authoring tools began inserting valid, complete DOCTYPEs into the markup their tools generated; and
IE6’s rendering behavior was not updated for five years, leading many developers to assume its rendering was both accurate and unlikely to change.

Together, these two circumstances have undermined the DOCTYPE switch because it had one fatal flaw: it assumed that the use of a valid DOCTYPE meant that you knew what you were doing when it came to web standards, and that you wanted the most accurate rendering possible. How do we know that it failed? When IE 7 hit the streets, sites broke.

Sure, as Roger pointed out, some of those sites were using IE-6-specific CSS hacks (often begrudgingly, and with no choice). But most suffered because their developers only checked their pages in IE6 —or only needed to concern themselves with how the site looked in IE6, because they were deploying sites within a homogeneous browserscape (e.g. a company intranet). Now sure, you could just shrug it off and say that since IE6’s inaccuracies were well-documented, these developers should have known better, but you would be ignoring the fact that many developers never explicitly opted into “standards mode,” or even knew that such a mode existed.

This seems like an intractable problem to me. If you ship a version of your software that is more standards compliant than previous versions you run the risk of breaking applications or content that worked in previous versions. This reminds me of Windows Vista getting the blame because Facebook had a broken IPv6 record. The fact is that the application can claim it is more standards compliant but that is meaningless if users can no longer access their data or visit their favorite sites. In addition, putting the onus on Web developers and content authors to always write standards compliant code is impossible given the acknowledged low level of expertise of said Web content authors. It would seem that this actually causes a lot of pressure to always be backwards (or is that bugwards) compatible. I definitely wouldn’t want to be in the Internet Explorer team’s shoes these days.

It puts an interesting wrinkle on the exhortations to make markup languages friendly to “View Source”, doesn’t it?

Now playing: Green Day - Welcome To Paradise

Categories: Web Development

February 21, 2008

@ 05:17 PM

Microsoft Announces Data Portability Principles for Office 2007, Exchange Server 2007, Office SharePoint Server 2007, and Windows Server 2008

From the press release entitled Microsoft Makes Strategic Changes in Technology and Business Practices to Expand Interoperability we learn

REDMOND, Wash. — Feb. 21, 2008 — Microsoft Corp. today announced a set of broad-reaching changes to its technology and business practices to increase the openness of its products and drive greater interoperability, opportunity and choice for developers, partners, customers and competitors.

Specifically, Microsoft is implementing four new interoperability principles and corresponding actions across its high-volume business products: (1) ensuring open connections; (2) promoting data portability; (3) enhancing support for industry standards; and (4) fostering more open engagement with customers and the industry, including open source communities.
...
The interoperability principles and actions announced today apply to the following high-volume Microsoft products: Windows Vista (including the .NET Framework), Windows Server 2008, SQL Server 2008, Office 2007, Exchange Server 2007, and Office SharePoint Server 2007, and future versions of all these products. Highlights of the specific actions Microsoft is taking to implement its new interoperability principles are described below.

Ensuring open connections to Microsoft’s high-volume products.

Documenting how Microsoft supports industry standards and extensions.

Enhancing Office 2007 to provide greater flexibility of document formats.

Launching the Open Source Interoperability Initiative.

Expanding industry outreach and dialogue.

More information can be found on the Microsoft Interoperability page. Nice job, ROzzie and SteveB.

Now playing: Timbaland - Apologize (Feat. One Republic)

Categories: Life in the B0rg Cube

February 21, 2008

@ 01:24 PM

Comments [5]

Facebook Moves to Curtail Application Spam: What Took So Long?

One of the biggest problems with the Facebook user experience today is the amount of spam from applications that are trying to leverage its social networks to "grow virally". For this reason, it is unsurprising to read the blog post from Paul Jeffries on the Facebook blog entitled Application Spam where he writes

We've been working on several improvements to prevent this and other abuses by applications. We'll continue to make changes, but wanted to share some of what's new:

When you get a request from an application, you now have the ability to "Block Application" directly from the request. If you block an application, it will not be able to send you any more requests.

A few weeks ago, we added the ability to "Clear All" requests from your requests page when you have a lot of requests and invitations that you haven't responded to yet.

Your feedback now determines how many communications an application can send. When invitations and notifications are ignored, blocked, or marked as spam, Facebook reduces that application's ability to send more. Applications forcing their users to send spammy invitations can wind up with no invitations at all. The power is in your hands; block applications that are bothering you, and report spammy or abusive communications, and we'll restrict the application.

We've explicitly told developers they cannot dead-end you in an "Invite your Friends" loop. If you are trapped by an application, look for a link to report that "This application is forcing me to invite friends". Your reports will help us stop this behavior.

We've added an option to the Edit Applications page that allows you to opt-out of emails sent from applications you've already added. When you add a new application, you can uncheck this option right away.

A lot of these are fairly obvious restrictions that put users back in control of their experience. I'm quite surprised that it took so long to add a "Block Application" feature. I can understand that Facebook didn't want to piss off developers on their platform but app spam has become a huge negative aspect of using Facebook. About two months ago, I wrote a blog post entitled Facebook: Placing Needs of Developers Over Needs of Users where I pointed out the Facebook group This has got to stop (POINTLESS FACEBOOK APPLICATIONS ARE RUINING FACEBOOK). At the time of posting that entry, the group had 167,186 members.

This morning, the group has 480,176 members. That's almost half a million people who have indicated that app spam on the site is something they despise. It is amazing that Facebook has let this problem fester for so long given how important keeping their user base engaged and happy with the site is to their bottom line.

Now Playing: Lil' Scrappy feat. Paul Wall - Hustle Man

Categories: Social Software

February 16, 2008

@ 07:29 PM

Comments [8]

Thoughts on Google's Proposal for Granular Updates in AtomPub

Via Sam Ruby's post Embrace, Extend then Innovate I found a link to Joe Gregorio's post entitled How to do RESTful Partial Updates. Joe's post is a recommendation of how to extend the Atom Publishing Protocol (RFC 5023) to support updating the properties of an entry without having to replace the entire entry. Given that Joe works for Google on GData, I have assumed that Joe's post is Google's attempt to float a trial balloon before extending AtomPub in this way. This is a more community centric approach than the company has previously taken with GData, OpenSocial, etc where these protocols simply appeared out of nowhere with proprietary extensions to AtomPub with an FYI to the community after the fact.

The Problem Statement

In the Atom Publishing Protocol, an atom:entry represents an editable resource. When editing that resource, it is intended that an AtomPub client should download the entire entry, edit the fields it needs to change and then use a conditional PUT request to upload the changed entry.

So what's the problem? Below is an example of the results one could get from invoking the users.getInfo method in the Facebook REST API.

<user>
    <uid>8055</uid>
    <about_me>This field perpetuates the glorification of the ego. Also, it has a character limit.</about_me>
    <activities>Here: facebook, etc. There: Glee Club, a capella, teaching.</activities>
    <birthday>November 3</birthday>
    <books>The Brothers K, GEB, Ken Wilber, Zen and the Art, Fitzgerald, The Emporer's New Mind, The Wonderful Story of Henry Sugar</books>
    <current_location>
        <city>Palo Alto</city>
        <state>CA</state>
        <country>United States</country>
        <zip>94303</zip>
    </current_location>
    <first_name>Dave</first_name>
    <interests>coffee, computers, the funny, architecture, code breaking,snowboarding, philosophy, soccer, talking to strangers</interests>
    <last_name>Fetterman</last_name>
    <movies>Tommy Boy, Billy Madison, Fight Club, Dirty Work, Meet the Parents, My Blue Heaven, Office Space </movies>
    <music>New Found Glory, Daft Punk, Weezer, The Crystal Method, Rage, the KLF, Green Day, Live, Coldplay, Panic at the Disco, Family Force 5</music>
    <name>Dave Fetterman</name>
    <profile_update_time>1170414620</profile_update_time>
    <relationship_status>In a Relationship</relationship_status>
    <religion/>
    <sex>male</sex>
    <significant_other_id xsi:nil="true"/>
    <status>
        <message>Pirates of the Carribean was an awful movie!!!</message>
    </status>
</user>

If this user was represented as an atom:entry then each time an application wants to edit the user's status message it needs to download the entire data for the user with its over two dozen fields, change the status message in an in-memory representation of the XML document and then upload the entire user atom:entry back to the server. This is a fairly expensive way to change a status message compared to how this is approached in other RESTful protocols (e.g. PROPPATCH in WebDAV).
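To put rough numbers on that cost, here is a small Python sketch of the download, edit, upload cycle. It uses only a handful of the fields from the document above (a subset chosen for brevity, so the real gap is larger) and the `update_status` helper is a name made up for this example. It compares the bytes moved in each direction with the size of the one value that actually changed.

```python
import xml.etree.ElementTree as ET

# A subset of the fields from the users.getInfo result above, kept short for the example.
USER_XML = """<user>
  <uid>8055</uid>
  <first_name>Dave</first_name>
  <last_name>Fetterman</last_name>
  <music>New Found Glory, Daft Punk, Weezer, The Crystal Method, Rage, the KLF, Green Day, Live, Coldplay, Panic at the Disco, Family Force 5</music>
  <relationship_status>In a Relationship</relationship_status>
  <status><message>Pirates of the Carribean was an awful movie!!!</message></status>
</user>"""


def update_status(document, new_message):
    """Full-document edit: parse everything, change one field, serialize everything."""
    user = ET.fromstring(document)
    user.find("status/message").text = new_message
    return ET.tostring(user, encoding="unicode")


new_message = "Editing one field"
updated = update_status(USER_XML, new_message)

downloaded = len(USER_XML.encode("utf-8"))   # bytes fetched by the GET
uploaded = len(updated.encode("utf-8"))      # bytes sent back by the PUT
changed = len(new_message.encode("utf-8"))   # bytes of data that actually changed

print(f"downloaded={downloaded} uploaded={uploaded} changed={changed}")
```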

Previous Discussions on this Topic: When the Shoe is on the Other Foot

A few months ago I brought up this issue as one of the problems encountered when using the Atom Publishing Protocol outside of blog editing contexts in my post Why GData/APP Fails as a General Purpose Editing Protocol for the Web. In that post I wrote

Lack of support for granular updates to fields of an item: As mentioned in the previous section editing an entry requires replacing the old entry with a new one. The expected client interaction with the server is described in section 5.4 of the current APP draft and is excerpted below.
Retrieving a Resource
Client                                     Server
  |                                           |
  |  1.) GET to Member URI                    |
  |------------------------------------------>|
  |                                           |
  |  2.) 200 Ok                               |
  |      Member Representation                |
  |<------------------------------------------|
  |                                           |
The client sends a GET request to the URI of a Member Resource to retrieve its representation.

The server responds with the representation of the Member Resource.

Editing a Resource
Client                                     Server
  |                                           |
  |  1.) PUT to Member URI                    |
  |      Member Representation                |
  |------------------------------------------>|
  |                                           |
  |  2.) 200 OK                               |
  |<------------------------------------------|
The client sends a PUT request to store a representation of a Member Resource.

If the request is successful, the server responds with a status code of 200.
Can anyone spot what's wrong with this interaction? The first problem is a minor one that may prove problematic in certain cases. The problem is pointed out in the note in the documentation on Updating posts on Google Blogger via GData which states

IMPORTANT! To ensure forward compatibility, be sure that when you POST an updated entry you preserve all the XML that was present when you retrieved the entry from Blogger. Otherwise, when we implement new stuff and include <new-awesome-feature> elements in the feed, your client won't return them and your users will miss out! The Google data API client libraries all handle this correctly, so if you're using one of the libraries you're all set.

Thus each client is responsible for ensuring that it doesn't lose any XML that was in the original atom:entry element it downloaded. The second problem is more serious and should be of concern to anyone who's read Editing the Web: Detecting the Lost Update Problem Using Unreserved Checkout. The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.

That post was negatively received by many members of the AtomPub community including Joe Gregorio. Joe wrote a scathing response to my post entitled In which we narrowly save Dare from inventing his own publishing protocol where he addressed that particular issue as follows

The second complaint is one of data loss:

The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.

Fortunately, the only real problem is that Dare seems to have only skimmed the specification. From Section 9.3:

To avoid unintentional loss of data when editing Member Entries or Media Link Entries, Atom Protocol clients SHOULD preserve all metadata that has not been intentionally modified, including unknown foreign markup as defined in Section 6 of [RFC4287].

And further, from Section 9.5:

Implementers are advised to pay attention to cache controls, and to make use of the mechanisms available in HTTP when editing Resources, in particular entity-tags as outlined in [NOTE-detect-lost-update]. Clients are not assured to receive the most recent representations of Collection Members using GET if the server is authorizing intermediaries to cache them.

Hey look, we actually reference the lost update paper that specifies how to solve this problem, right there in the spec! And Section 9.5.1 even shows an example of just such a conditional PUT failing. Who knew? And just to make this crystal clear, you can build a server that is compliant to the APP that accepts only conditional PUTs. I did, and it performed quite well at the last APP Interop.

The bottom line of Joe's response is that he didn't think it was a real problem. My assumption is that his perspective on the problem has broadened now that he has a responsibility to the wide breadth of AtomPub implementations at Google as opposed to when his design decisions were being influenced by a home grown blogging server he wrote in his free time.
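For readers who haven't run into the lost update problem, the following Python sketch plays out both versions of the exchange against a toy in-memory server. `EntryServer`, `PreconditionFailed` and `run_scenario` are names invented for this example. The version counter stands in for an HTTP entity tag, the `if_match` argument for an If-Match request header, and the exception for a 412 Precondition Failed response. It is a simplification, not a description of any actual AtomPub server.

```python
class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class EntryServer:
    """Toy server holding one entry and a version counter used as its entity tag."""

    def __init__(self, entry):
        self._entry = dict(entry)
        self._version = 1

    def _etag(self):
        return f'"{self._version}"'

    def get(self):
        """Return a copy of the entry together with its current entity tag."""
        return dict(self._entry), self._etag()

    def put(self, entry, if_match=None):
        """Replace the entry; when if_match is given, reject stale writers."""
        if if_match is not None and if_match != self._etag():
            raise PreconditionFailed("entry changed since it was fetched")
        self._entry = dict(entry)
        self._version += 1
        return self._etag()


def run_scenario(conditional):
    """Two clients fetch the same entry, then each PUTs back a different edit."""
    server = EntryServer({"title": "Draft", "content": "Some text."})
    a_copy, a_tag = server.get()
    b_copy, b_tag = server.get()

    a_copy["title"] = "Final"
    server.put(a_copy, if_match=a_tag if conditional else None)

    b_copy["content"] = "Edited text."  # B still holds the old title
    try:
        server.put(b_copy, if_match=b_tag if conditional else None)
        rejected = False
    except PreconditionFailed:
        rejected = True
    return server.get()[0], rejected


print(run_scenario(conditional=False))  # A's title change is silently overwritten
print(run_scenario(conditional=True))   # B's stale PUT is rejected instead
```

Note that the conditional PUT only detects the conflict. The second client still has to fetch the entry again and redo its edit, which is part of why whole-entry replacement gets expensive when many clients touch different fields.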

The Google Solution: Embrace, Extend then Innovate

Now that Joe thinks supporting granular updates of a resource is a valid scenario, he and the folks at Google have proposed the following solution to the problem. Joe writes

Now if I wanted to update part of this entry, say the title, using the mechanisms in RFC 5023 then I would change the value of the title element and PUT the whole modified entry back to the URI http://example.org/edit/first-post.atom. Now this document isn't large, but we'll use it to demonstrate the concepts. The first thing we want to do is add a URI Template that allows us to construct a URI to PUT changes back to:
<?xml version="1.0"?>
<entry         
        xmlns="http://www.w3.org/2005/Atom"
        xmlns:t="http://blah...">
    <t:link_template ref="sub"
        href="http://example.org/edit/first-post/{-listjoin|;|id}"/>
    <title>Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <author><name>John Doe</name></author>
    <content>Some text.</content>
    <link rel="edit"
        href="http://example.org/edit/first-post.atom"/>
</entry>
Then we need to add id's to each of the pieces of the document we wish to be able to individually update. For this we'll use the W3C xml:id specification:
<?xml version="1.0"?>
<entry         
        xmlns="http://www.w3.org/2005/Atom"
        xmlns:t="http://blah...">   
    <t:link_template ref="sub" href="http://example.org/edit/first-post/{-listjoin|;|id}"/>
    <title xml:id="X1">Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <author xml:id="X2"><name>John Doe</name></author>
    <content xml:id="X3">Some text.</content>
    <link rel="edit"
        href="http://example.org/edit/first-post.atom"/>
</entry>
So if I wanted to update both the content and the title I would construct the partial update URI using the id's of the elements I want to update:

http://example.org/edit/first-post/X1;X3

And then I would PUT an entry to the URI with only those child elements:
PUT /edit/first-post/X1;X3
Host: example.org

<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <title xml:id="X1">False alarm on the Atom-Powered Robots things</title>
   <content xml:id="X3">Sorry about that.</content>
</entry>
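
To make the client side of this concrete, below is a rough sketch of how a client might build the partial update URI from the template in Joe's example. It handles only the `-listjoin` operator from the experimental URI Templates draft used above and does no percent-encoding. The `expand_listjoin` function is a name I made up for illustration, not part of any library.

```python
import re

# Matches the draft-era operator used in the excerpt:
# {-listjoin|<separator>|<variable>}
LISTJOIN = re.compile(r"\{-listjoin\|([^|}]*)\|([^|}]+)\}")


def expand_listjoin(template, variables):
    """Expand only the -listjoin operator; other template features are ignored."""

    def replace(match):
        separator, name = match.group(1), match.group(2)
        return separator.join(variables.get(name, []))

    return LISTJOIN.sub(replace, template)


template = "http://example.org/edit/first-post/{-listjoin|;|id}"
partial_uri = expand_listjoin(template, {"id": ["X1", "X3"]})
print(partial_uri)  # http://example.org/edit/first-post/X1;X3
```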

The Problems with the Google Solution: Your Shipment of FAIL has Arrived

Ignoring the fact that this spec depends on specifications that are either experimental (URI Templates) or not widely supported (xml:id), there are still significant problems with how this approach (mis)uses the Atom Publishing Protocol. Sam Ruby eloquently points out the problems in his post Embrace, Extend then Innovate where he wrote

With HTTP PUT, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. Having some servers interpret the removal of elements (such as content) as a modification, and others interpret the requests in such a way that elided elements are to be left alone is hardly uniform or self-descriptive. In fact, depending on usage, it is positively stateful.

I’m fine with a server choosing to interpret the request anyway it sees fit. As a black box, it could behave as if it updated the resource as requested and then one nanosecond later — and before it processes any other requests — fill in missing data with defaults, historical data, whatever. My concern is with clients coding to the assumption as to how the server works. That’s called coupling.

The main problem is that it changes the expected semantics of HTTP PUT in a way that not only conflicts with how PUT is typically used in other HTTP-based protocols but also how it is used in AtomPub. It's also weird that the existence of xml:id in an Atom document is now used to imply special semantics (i.e. this field supports direct editing). I especially don't like that after all is said and done, the server controls which fields can be partially updated or not which seems to imply a tight coupling between clients and servers (e.g. some servers will support partial updates on all fields, some may only support partial updates on atom:title + atom:category while others will support partial updates on a different set of fields). So the code for editing a title or category changes depending on which AtomPub service you are talking to.

From where I stand, Joe has pretty much invented yet another diff + patch protocol for XML documents. When I worked on the XML team at Microsoft, there were quite a few floating around the company including Diffgram, UpdateGram, and Patchgrams to name three. So I've been around the block when it comes to diff + patch formats for XML and this one has its share of issues. The most eyebrow-raising issue with the diff + patch protocol is that half the semantics of the update are in the XML document (which elements to add/edit) while the other half are in the URL (if an ID exists in the URL but is not in the document then it is a delete). This means the XML isn't very self-describing, nor can it really be said that the URL is identifying a resource [more like it identifies an operation].
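
To see how the semantics end up split between the URL and the body, here is a sketch of what a server would have to do to honor such a request, based on my reading of the proposal. The `apply_partial_put` function is hypothetical and is not taken from any Google implementation.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

STORED = f"""<entry xmlns="{ATOM}">
  <title xml:id="X1">Atom-Powered Robots Run Amok</title>
  <author xml:id="X2"><name>John Doe</name></author>
  <content xml:id="X3">Some text.</content>
</entry>"""

# The body mentions X1 only, even though the URL will name X1 and X3.
PARTIAL = f"""<entry xmlns="{ATOM}">
  <title xml:id="X1">False alarm on the Atom-Powered Robots things</title>
</entry>"""


def apply_partial_put(stored_xml, ids_in_url, partial_xml):
    """Hypothetical server-side handling of PUT /edit/first-post/<ids>."""
    entry = ET.fromstring(stored_xml)
    sent = {el.get(XML_ID): el for el in ET.fromstring(partial_xml)}
    for child in list(entry):
        xml_id = child.get(XML_ID)
        if xml_id not in ids_in_url:
            continue  # not addressed by the URL: left untouched
        position = list(entry).index(child)
        entry.remove(child)
        if xml_id in sent:
            # Named in the URL and present in the body: replace in place.
            entry.insert(position, sent[xml_id])
        # Named in the URL but absent from the body: stays removed (a delete).
    return entry


updated = apply_partial_put(STORED, ["X1", "X3"], PARTIAL)
remaining_ids = sorted(el.get(XML_ID) for el in updated)
print(remaining_ids)  # ['X1', 'X2']
```

Neither the request body nor the URL alone tells you that the content element was deleted. You need both, which is the problem.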

Actual Solution: Read the Spec

In Joe's original response to my post, his suggestion was that the solution to the "problem" of lack of support for granular updates of entries in AtomPub is to read the spec. In retrospect, I agree. If a field is important enough that it needs to be identifiable and editable then it should be its own resource. If you want to make it part of another resource then use atom:link to link both resources.

Case closed. Problem solved.

Now Playing: Too Short - Couldn't Be a Better Player Than Me (feat. Lil Jon & The Eastside Boyz)

Categories: Syndication Technology | XML Web Services

February 16, 2008

@ 07:28 PM

The Windows Live Spaces Photo API (alpha)

It's a testament to how busy I've been at work focusing on the Contacts platform that I missed an announcement by Angus Logan a few months ago that there had been an alpha release of a REST API for accessing photos on Windows Live Spaces. The MSDN page for the API describes the API as

Welcome to the Alpha release of the Windows Live Spaces Photos API. The Windows Live Spaces Photo API allows Web sites to view and update Windows Live Spaces photo albums using the WebDAV protocol. Web sites can incorporate the following functionality:

Upload or download photos.

Create, edit, or delete photo albums.

Request a list of a user's albums, photos, or comments.

Edit or delete content for an existing entry.

Query the content in an existing entry.

This news is of particular interest to me since this API is the fruit of my labor that was first hinted at in my post A Flickr-like API for MSN Spaces? from a little over two years ago. At the time, I was responsible for the public APIs for ~~MSN~~ Windows Live Spaces and had just finished working on the MetaWeblog API for Windows Live Spaces.

The biggest design problem we faced at the time was how to give applications the ability to access a user's personal data, which required the user to be authenticated, without having dozens of hastily written applications collecting people's usernames and passwords. In general, if we were just a blogging site it may not have been a big deal (e.g. the Twitter API requires that you give your username & password to random apps which may or may not be trustworthy). However we were part of ~~MSN~~ Windows Live, which meant that we had to ensure that users' credentials were safeguarded and we didn't end up training users on how to be phished by entering their ~~Passport~~ Windows Live ID credentials into random applications and Web sites.

To get around this problem with our implementation of the MetaWeblog API, I came up with a scheme where users had to use a special username and password when accessing their Windows Live Spaces blog via the API. This was a quick & dirty hack which had plenty of long-term problems. For one, users had to go through the process of "enabling API access" before they could use blogging tools or other MetaWeblog API clients with the service. Another was that the problem still wasn't solved for other Windows Live services that wanted to enable APIs. Should each API have its own username and password? That would be quite confusing and overwhelming for users. Should they re-use our API-specific username and password? In that case we would be back to square one by exposing an important set of user credentials to random applications.

The right solution eventually decided upon was to come up with a delegated authentication model where a user grants an application permission to act on his or her behalf without having to share credentials with the application. This is the model followed by the Windows Live Contacts API, the Facebook API, Google AuthSub, Yahoo! BBAuth, the Flickr API and a number of other services on the Web that provide APIs to access a user's private data.

Besides that decision, there was also the question of what form the API should take. Should we embrace & extend the MetaWeblog API with extensions for managing photos & media? Should we propose a proprietary API based on SOAP or REST? Adopt someone else's proprietary API (e.g. the Flickr API)? In the end, I pushed for an API that was completely RESTful and completely standards-based. Thus we built the API on WebDAV (RFC 2518).

WebDAV seemed like a great fit for a lot of reasons.

Photo albums map quite well to collections which are often modeled as folders by WebDAV clients.
Support for WebDAV is already baked into a lot of client applications on numerous platforms
It is RESTful which is important when building a protocol for the Web
Proprietary metadata could easily be represented as WebDAV properties
Support for granular updates of properties via PROPPATCH

The last one turns out to be pretty important as it is an issue today with everyone's favorite REST protocol du jour. More on that topic in my following post.
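
To illustrate what a granular update via PROPPATCH looks like, here is a small sketch that builds a request body setting a single property. The `propertyupdate`, `set` and `prop` elements in the `DAV:` namespace come from the WebDAV spec. The photo property namespace and the `caption` property are made up for the example and are not taken from the Spaces API.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
# Hypothetical namespace for photo metadata; not the real Spaces schema.
PHOTO = "http://example.org/photo-props"


def build_proppatch(properties):
    """Return a propertyupdate body that sets only the given properties."""
    ET.register_namespace("D", DAV)
    ET.register_namespace("P", PHOTO)
    update = ET.Element(f"{{{DAV}}}propertyupdate")
    set_el = ET.SubElement(update, f"{{{DAV}}}set")
    prop = ET.SubElement(set_el, f"{{{DAV}}}prop")
    for name, value in properties.items():
        ET.SubElement(prop, f"{{{PHOTO}}}{name}").text = value
    return ET.tostring(update, encoding="unicode")


# Only the caption is sent; every other property on the resource is untouched.
body = build_proppatch({"caption": "Sunset over Lagos"})
print(body)
```

The client would send this body in a PROPPATCH request to the photo's URL. Properties that aren't mentioned in the body are left alone, so the client never has to round-trip the full set of metadata.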

Now Playing: Lil Jon & The Eastside Boyz - Put Yo Hood Up (remix) (feat. Jadakiss, Petey Pablo & Chyna White)

Categories: Windows Live | XML Web Services

February 16, 2008

@ 07:27 PM

Comments [0]

ADO.NET Data Services (Astoria) Adopts AtomPub

Pablo Castro has a blog post entitled AtomPub support in the ADO.NET Data Services Framework where he talks about the progress they've made in building a framework for using the Atom Publishing Protocol (RFC 5023) as a protocol for communicating with SQL Server and other relational databases. Pablo explains why they've chosen to build on AtomPub in his post which is excerpted below

Why are we looking at AtomPub?

Astoria data services can work with different payload formats and to some level different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kind of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.

If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that is hard to ignore. Of course, the devil is in the details and we'll get to that later on.

The interaction model in Astoria is just plain HTTP, using the usual methods for creating, updating, deleting and retrieving resources. Furthermore, we use other HTTP constructs such as "ETags" for concurrency checks, "location" to know where a POSTed resource lives, and so on. All of these also map naturally to AtomPub.

From our (Microsoft) perspective, you could imagine a world where our own consumer and infrastructure services in Windows Live could speak AtomPub with the same idioms as Astoria services, and thus could both have a standards-based interface and also use the same development tools and runtime components that work with any Astoria-based server. This would mean less clients/development tools for us to create and more opportunity for our partners in the libraries and tools ecosystem out there.

Although I'm not responsible for any public APIs at Microsoft these days, I've found myself drawn into the various internal discussions on RESTful protocols and AtomPub due to the fact that I'm a busybody. :)

Early on in the Atom effort, I felt that the real value wasn't in defining yet another XML syndication format but instead in the editing protocol. Still, I underestimated how much mind share and traction AtomPub would eventually end up getting in the industry. I'm glad to see Microsoft making a huge bet on standards-based, RESTful protocols, especially given our recent history where we foisted Snakes On A Plane on the industry.

However since AtomPub is intended to be an extensible protocol, Astoria has added certain extensions to make the service work for their scenarios while staying within the letter and spirit of the spec. Pablo talks about some of their design decisions when he writes

We are simply mapping whatever we can to regular AtomPub elements. Sometimes that is trivial, sometimes we need to use extensions and sometimes we leave AtomPub alone and build an application-level feature on top. Here is an initial list of aspects we are dealing with in one way or the other. We’ll also post elaborations of each one of these to the appropriate Atom syntax|protocol mailing lists.
...
c) Using AtomPub constructs and extensibility mechanisms to enable Astoria features:

Inline expansion of links (“GET a given entry and all the entries related through this named link”, how we represent a request and the answer to such a request in Atom?).

Properties for entries that are media link entries and thus cannot carry any more structured data in the <content> element

HTTP methods acting on bindings between resources (links) in addition to resources themselves

Optimistic concurrency over HTTP, use of ETags and in general guaranteeing consistency when required

Request batching (e.g. how does a client send a set of PUT/POST/DELETE operations to the server in a single go?)

d) Astoria design patterns that are not AtomPub format/protocol concepts or extensions:

Astoria gives semantics to URLs and has a specific syntax to construct them

How metadata that describes the structure of a service end points is exposed. This goes from being able to find out entry points (e.g. collections in service documents) to having a way of discovering the structure of entries that contain structured data

Pablo will be posting more about the Astoria design decisions on atom-syntax and atom-protocol in the coming weeks. It'll be interesting to see the feedback on the approaches they've taken with regards to following the protocol guidelines and extending it where necessary.

It looks like I'll have to renew my subscription to both mailing lists.

Now Playing: Lil Jon & The Eastside Boyz - Grand Finale (feat Nas, Jadakiss, T.I., Bun B & Ice Cube)

Categories: Platforms | XML Web Services

February 13, 2008

@ 08:04 AM

Comments [7]

Yahoo! Layoffs: How Screwed Up is Yahoo?

News of the layoffs at Yahoo! has now hit the presses. A couple of the folks who've been reported as laid off are people I know to be great employees, either via professional interaction or by reputation. The list of people who fit this bill so far is Susan Mernit, Bradley Horowitz, Salim Ismail and Randy Farmer. Salim used to run Yahoo's "incubation" wing so this is a pretty big loss. It is particularly interesting that he volunteered to leave the company, which may be a coincidence or could imply that some of the other news about Yahoo! has motivated some employees to seek employment elsewhere. It will be interesting to see how this plays out in the coming months.

Randy Farmer is also a surprise given that he pretty much confirmed that he was working on Jerry Yang's secret plan for a Yahoo comeback which included

Rethinking the Yahoo homepage

Consolidating Yahoo's plethora of social networks

Opening up Yahoo to third parties with a consistent platform similar to Facebook's

Revamping Yahoo's network infrastructure

If Yahoo! is reducing headcount by letting go of folks working on various next generation projects, this can't bode well for the future of the company given that success on the Web depends on constant innovation.

PS: To any ex-Yahoos out there, if the kind of problems described in this post sound interesting to you, we're always hiring. Give me a holler. :)

Categories: Competitors/Web Companies

February 12, 2008

@ 04:00 AM

Comments [6]

To Mini-Microsoft: On Building Software Experiences that Delight Users

Mini-Microsoft has a blog post entitled Microsoft's Yahoo! Acquisition is Bold. And Dumb. which contains the following excerpt

To tell you the truth, if you had pulled me aside when I was in school, holding court in the computer science lab, and whispered in my ear ala The Graduate: "online ads..." I would have laughed my geek butt off.

So Google gets to have the joke on me, but for us to bet the company and build Microsoft's future foundation on ads revenue? WTF? As someone who considers themselves a citizen, not a consumer, I want to create software experiences that make people's lives delightful and better, not that sells them crap they don't need while putting them deeper into debt. I'm going to be in purgatory long enough as is.

I find this sentiment somewhat ironic coming from Mini-Microsoft. Microsoft’s bread and butter comes from selling software that people have to use, not software that they want to use. In fact, you can argue that the fundamental problem the company has had in gaining traction in certain consumer-centric markets is that our culture is still influenced by selling to IT departments and developers (i.e. where features and checklists are important) as opposed to selling to consumers (i.e. where user experience is the most important thing).

Specifically, it is hard for me to imagine that there are more people in the world who think that whatever Microsoft product Mini works on has given them more delight or improved their lives more than Facebook, Flickr, Google, MySpace or Windows Live Messenger, which happen to all be ad-supported software. Thus it is amusing to see him imply that ad-supported software is the antithesis of software that delights and improves people's quality of life.

The way I see it, Jerry Yang is right that from the perspective of a user “You Always Have Other Options” when it comes to free (ad-supported), Web-based software, which encourages applications to innovate in the user experience to differentiate themselves. It is no small wonder that we’ve seen more innovations in social applications and user interfaces in the world of free, Web-based applications than we’ve seen in the world of proprietary, commercial software. Something to think about the next time you decide to crap on ad-supported Web apps because you think building commercial software is some sort of noble cause that results in perfect, customer-delighting software, Mini.

Now playing: Snoop Doggy Dogg - Downtown Assassins

Categories: Life in the B0rg Cube

February 11, 2008

@ 12:59 AM

Comments [6]

You Can't Please Everyone: http://example.com != http://www.example.com

A couple of weeks ago I got a bug filed against me in RSS Bandit with the title Weak duplicate feed detection that had the following description

I already subscribed to "http://feeds.haacked.com/haacked" feed. Then while browsing the feed's homepage the feed autodetection gets refreshed with a "new feed found", url: "http://feeds.haacked.com/haacked/". It is not detected as a duplicate (ends with backslash) there and also not detected in the subscription wizard.

Even though I argued that there are lots of URLs that seem equivalent to end users but aren't according to the specs, I decided to go ahead and fix the two common types of equivalence that trip up end users:

http://www.example.com and http://www.example.com/ are not the same URL and
http://example.com and http://www.example.com aren't the same URL

Within a few hours of shipping this in version 1.6.0.2 of RSS Bandit, the bug reports started coming in hard. There's a thread in our forums with the title URL Corruption After Adding a Feed with the following complaints

After I add the feed at: http://www.simple-talk.com/feed/
I get an error in the Error Log and when I restart RSS Bandit, the URL has been truncated to http://www.simple-talk.com

Possibly related problem:
When I add "http://www.amazonsellercommunity.com/forums/rss/rssmessages.jspa?forumID=22" then RSS Bandit slices and dices it into "http://amazonsellercommunity.com/forums/rss/rssmessages.jspa?forumID=22".
Sorry about the period outside the quotation mark. I wanted to make sure the URL was clear.

Same problem--have a feed, I've used it in the past with RSSBandit and was trying to enter it (tried the usual way first, then turned off autodiscover, then tried to change it via properties) but no matter what I do the www. in front disappears, and the feed doesn't work. http://www.antipope.org/charlie/blog-static/

Each of these is an example where the URL works when the domain starts with "www" but doesn't if you take it out. This is definitely a case of going from bad to worse. We went from the minor irritation of duplicate feeds not being detected when you subscribe to the same feed twice to users being unable to access certain feeds at all.

My apologies to everyone affected by this problem. I will be dialing back the canonicalization process to only treat trailing slashes as equivalent. Expect the installer to be refreshed within the next hour or so.
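
For the curious, the dialed-back rule can be sketched roughly as follows. This is an illustration in Python and not the actual RSS Bandit code; `duplicate_key` and `is_duplicate` are names invented for the example. The important design choice is that the normalized form is only used as a comparison key and the URL the user entered is never rewritten.

```python
from urllib.parse import urlsplit, urlunsplit


def duplicate_key(url):
    """Key used for duplicate detection only; the stored feed URL is untouched."""
    parts = urlsplit(url)
    # Treat a trailing slash on the path as insignificant.
    path = parts.path.rstrip("/")
    # Scheme and host are case-insensitive. The "www." prefix is deliberately
    # kept, and the query string is preserved as-is.
    # Assumption: feed URLs carry no user:password@ section in the host part.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def is_duplicate(url_a, url_b):
    return duplicate_key(url_a) == duplicate_key(url_b)


# The case from the original bug report is now caught...
print(is_duplicate("http://feeds.haacked.com/haacked",
                   "http://feeds.haacked.com/haacked/"))
# ...while hosts with and without "www." stay distinct feeds.
print(is_duplicate("http://www.antipope.org/charlie/blog-static/",
                   "http://antipope.org/charlie/blog-static/"))
```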

Now Playing: Young Jeezy feat. Swizz Beatz - Money In Da Bank (Remix)

Categories: RSS Bandit

February 9, 2008

@ 04:00 AM

Comments [6]

RSS Bandit Update: v1.6.0.2 Ships and Integrating with RSS feeds in Outlook/Exchange

I just realized that the current released version of RSS Bandit doesn’t have a working code name based on a character from the X-Men comic book. The previous 1.5.0.17 release was codenamed ShadowCat while the next release is codenamed Phoenix. Since the v1.6.0.x releases have been interim releases on the road to Phoenix, I’ve decided to give them the codename Jean Grey retroactively. Now, on to the updates.

Jean Grey (v1.6.0.x) Update

The last bug fix release of RSS Bandit fixed a few bugs but introduced a couple of even worse bugs [depending on your perspective]. We’ve shipped version 1.6.0.2 that addresses the following issues

Application crashes with AccessViolationException on startup on Windows XP.
Application crashes and red 'X' shows in feed subscriptions window on Windows XP.
User's credentials are not used when accessing feeds via a proxy server leading to proxy errors when fetching feeds.
Duplicate feed URLs not detected if they differ by trailing slash or "www." in the host name
Application crashes when displaying an error dialog when a certificate issue is detected with a secure feed.

The first three issues are regressions that were introduced as part of refactoring the code and making it work better on Windows Vista. Yet another data point that shows that you can never have too many unit tests and that beta testing isn’t a bad idea either.

You can download the new release from http://downloads.sourceforge.net/rssbandit/RssBandit1.6.0.2_Installer.zip

Phoenix (v2.0) Update

I’m continuing with my plan to make RSS Bandit a desktop client for Web based feed readers like NewsGator Online and Google Reader. I’ve been slightly sidetracked by the realization that it would be pretty poor form for a Microsoft employee to write an application that synchronized with Google’s RSS reader but not any of Microsoft’s, even if it is a side project. My current coding project is to integrate with the Windows RSS platform which would allow one to manipulate the same set of feeds in RSS Bandit, Internet Explorer 7 and Outlook 2007. The good news is that with Outlook 2007 integration, you also get Exchange synchronization for free.

The bad news has been having to use the RSS reading features of Internet Explorer 7 and Outlook 2007 on a regular basis as a way of eating my own dog food with regards to the integration features. It’s pretty stunning to see not one but two RSS reading applications that assume “mark all items as read” or “delete all feeds” are actions that users never have to take. When you have people writing shell scripts to perform basic tasks in your application then it is a clear sign that somewhere along the line, the user experience for that particular set of features got the shaft.

I’m about half way through the integration after which I’ll continue with integrating with Google Reader and finally NewsGator Online using an Outlook + Exchange style model. While I’m working on this, both Oren and Torsten will be mapping out the rewrite of the graphical user interface using WPF. I’ll probably need to buy a book on XAML or something in the next few months so I can contribute to this effort. The only thing I’ve heard about any of the various books about the subject on the market is that they all seem to have had their forewords written by Don Box. Does anyone have recommendations on which book or website I should use to start learning XAML + WPF?

Now playing: Eminem - Sing For The Moment

Categories: RSS Bandit | Syndication Technology

February 9, 2008

@ 04:00 AM

Lessons from the O'Reilly Social Graph FOO Camp

This past weekend I attended the O’Reilly Social Graph FOO Camp and got to meet a bunch of folks who I’ve only “known” via their blogs or news stories about them. My favorite moment was talking to Mark Zuckerberg about stuff I think is wrong with Facebook and he stops for a second while I’m telling him the story of naked pictures in my Facebook news feed then says “Dare? I read your blog”. Besides that, my favorite part of the experience was learning new things from folks with different perspectives and technical backgrounds from me. Whether it was hearing different perspectives on the social graph problem from folks like Joseph Smarr and Blaine Cook, getting schooled on the various real-world issues around using OpenID/OAuth in practice from John Panzer and Eran Hammer-Lahav or getting to Q&A Brad Fitzpatrick about the Google Social Graph API, it was a great learning experience all around.

There have been some ideas tumbling around in my head all week and I wanted to wait a few days before blogging to make sure I’d let the ideas fully marinate. Below are a few of the more important ideas I took away from the conference.

Social Network Discovery vs. Social Graph Portability

One of the most startling realizations I made during the conference is that a lot of my assumptions about why developers of social applications are interested in what has been mistakenly called “social graph portability” were incorrect. I had assumed a lot of social networking sites that utilize the password anti-pattern to screen scrape a user’s Hotmail/Y! Mail/Gmail/Facebook address book were doing that as a way to get a list of the user’s friends to ~~spam~~ invite to join the service. However a lot of the folks I met at the SG FOO Camp made me realize how much of a bad idea this would be if they actually did that. Sending out a lot of spam would lead to negativity being associated with their service and brand (Plaxo is still dealing with a lot of the bad karma they generated from their spammy days).

Instead the way social applications often use the contacts from a person’s email address book is to satisfy the scenario in Brad Fitzpatrick’s blog post URLs are People, Too where he wrote

So you've just built a totally sweet new social app and you can't wait for people to start using it, but there's a problem: when people join they don't have any friends on your site. They're lonely, and the experience isn't good because they can't use the app with people they know.

I then thought of my first time using Twitter and Facebook, and how I didn’t consider them of much use until I started interacting with people I already knew that used those services. More than once someone has told me, “I didn’t really get why people like Facebook until I got over a dozen friends on the site”.

So the issue isn’t really about “portability”. After all, my “social graph” of Hotmail or Gmail contacts isn’t very useful on Twitter if none of my friends use the service. Instead it is about “discovery”.

Why is this distinction important? Let’s go back to the complaint that Facebook doesn’t expose email addresses in its API. The site actually hides all contact information from their API, which is understandable. However since email addresses are also the only global identifiers we can rely on for uniquely identifying users on the Web, they are useful as a way of being able to figure out if Carnage4Life on Twitter is actually Dare Obasanjo on Facebook since you can just check if they are backed by the same email address.

I talked to both John Panzer and Brad Fitzpatrick about how we could bridge this gap and Brad pointed out something really obvious which he takes advantage of in the Google Social Graph API. We can just share email addresses using foaf:mbox_sha1sum (i.e. cryptographic one-way hashes of email addresses). That way we all have a shared globally unique identifier for a user but services don’t have to reveal their users’ email addresses.
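
As a quick sketch of how this would work in practice: foaf:mbox_sha1sum is the SHA-1 hash of the mailbox URI, i.e. the email address with a `mailto:` prefix. The trimming and lowercasing step below is my own assumption so that two services end up hashing the same string, and the email addresses are obviously made up.

```python
import hashlib


def mbox_sha1sum(email):
    # Assumption: trim and lowercase so both services hash the same string.
    mailbox_uri = "mailto:" + email.strip().lower()
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


def same_person(hash_from_service_a, hash_from_service_b):
    """Compare the hashes two services publish; no address is ever exchanged."""
    return hash_from_service_a == hash_from_service_b


twitter_side = mbox_sha1sum("dare@example.com")
facebook_side = mbox_sha1sum(" Dare@Example.com ")
print(same_person(twitter_side, facebook_side))  # True
```

Keep in mind that anyone who already knows or can guess an email address can compute the same hash, so this hides addresses from harvesting but does not make the identifier a secret.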

I wonder how we can convince the folks working on the Facebook platform to consider adding this as one of the properties returned by Users.getInfo?

You Aren’t Really My Friend Even if Facebook Says We Are

In a post entitled A proposal: email to URL mapping Brad Fitzpatrick wrote

People have different identifiers, of different security, that they give out depending on how much they trust you. Examples might include:

Homepage URL (very public)
Email address (little bit more secret)
Mobile phone number (perhaps pretty secretive)

When I think back to Robert Scoble getting kicked off of Facebook for screen scraping his friends’ email addresses and dates of birth into Plaxo, I wonder how many of his Facebook friends are comfortable with their personal contact information including email addresses, cell phone numbers and home addresses being utilized by Robert in this manner. A lot of people argued at SG FOO Camp that “If you’ve already agreed to share your contact info with me, why should you care whether I write it down on paper or download it into some social networking site?”.

That’s an interesting question.

I realized that one of my answers is that I actually don’t even want to share this info with the majority of the people in my Facebook friends list in the first place [as Brad points out]. The problem is that Facebook makes this a somewhat binary decision. Either I’m your “friend” and you get all my private personal details or I’ve faceslammed you by ignoring your friend request or only giving you access to my Limited Profile. I once tried to friend Andrew ‘Boz’ Bosworth (a former Microsoft employee who works at Facebook) and he told me he doesn’t accept friend requests from people he didn’t know personally so he ignored the friend request. I thought it was fucking rude even though objectively I realize it makes sense since it would mean I could view all his personal wall posts as well as his contact info. Funny enough, I always thought that it was a flaw in the site’s design that we had to have such an awkward social interaction.

I think the underlying problem again points to Facebook’s poor handling of multiple social contexts. In the real world, I separate my interactions with co-workers from that with my close friends or my family. For an application that wants to be the operating system underlying my social interactions, Facebook doesn’t do a good job of handling this fundamental reality of adult life.

Now playing: D12 - Revelation

Categories: Social Software | Trip Report

February 7, 2008

@ 04:00 AM

How to Probe Browser History Using Javascript in IE and Firefox

Niall Kennedy has a blog post entitled Sniff browser history for improved user experience where he describes a common-sense technique to test URLs against a Web browser’s visited page history. He writes

I first blogged about this technique almost two years ago but I will now provide even more details and example implementations.
...
A web browser such as Firefox or Internet Explorer will load the current user's browser history into memory and compare each link (anchor) on the page against the user's previous history. Previously visited links receive a special CSS pseudo-class distinction of :visited and may receive special styling.
...
Any website can test a known set of links against the current visitor's browser history using standard JavaScript.

Place your set of links on the page at load or dynamically using the DOM access methods.

Attach a special color to each visited link in your test set using finely scoped CSS.

Walk the evaluated DOM for each link in your test set, comparing the link's color style against your previously defined value.

Record each link that matches the expected value.

Customize content based on this new information (optional).

Each link needs to be explicitly specified and evaluated. The standard rules of URL structure still apply, which means we are evaluating a distinct combination of scheme, host, and path. We do not have access to wildcard or regex definitions of a linked resource.

Niall goes on to describe the common ways one can improve the user experience on a site using this technique. I’ve been considering using this approach to reduce the excess blog flair on my weblog. It doesn’t make much sense to show people a “submit to reddit” button if they don’t use reddit. The approach suggested in Niall’s article makes it possible for me to detect what sites a user visits and then only display relevant flair on my blog posts. Unfortunately, neither of Niall’s posts on the topic provides example code, which is why I’m posting this follow-up to Niall’s post. Below is an HTML page that uses a JavaScript function to return which social bookmarking sites a viewer of a Web page actually uses based on their browser history.


<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">  
  <head>
    <title>What Social Bookmarking Sites Do You Use?</title>  
    <script type="text/javascript">
// Sites to test against the visitor's browser history
var bookmarking_sites = new Array("http://programming.reddit.com", "http://www.dzone.com", "http://www.digg.com", "http://del.icio.us", "http://www.stumbleupon.com", "http://ma.gnolia.com", "http://www.dotnetkicks.com/", "http://slashdot.org");

function DetectSites() {
  var testelem = document.getElementById("linktest");
  var visited_sites = document.getElementById("visited_sites");

  for (var i = 0; i < bookmarking_sites.length; i++) {
    var isVisited = false;
    var linkColor;

    // Add a hidden test link; the a:visited rule colors it if the URL is in the history
    var url2test = document.createElement("a");
    url2test.href = bookmarking_sites[i];
    url2test.innerHTML = "I'm invisible, you can't click me";
    testelem.appendChild(url2test);

    if (document.defaultView) { // Mozilla
      linkColor = document.defaultView.getComputedStyle(url2test, null).getPropertyValue("color");
      // The computed color for CornflowerBlue is reported as an RGB triple
      if (linkColor == "rgb(100, 149, 237)") {
        isVisited = true;
      }
    } else if (url2test.currentStyle) { // IE
      // IE reports the color keyword; compare case-insensitively
      if (url2test.currentStyle.color.toLowerCase() == "cornflowerblue") {
        isVisited = true;
      }
    }

    if (isVisited) {
      visited_sites.innerHTML = visited_sites.innerHTML +
        "<a href='" + url2test.href + "'>" + url2test.href + "</a><br>";
    }
    // Remove the test link before checking the next site
    testelem.removeChild(url2test);
  }
}
      
    </script>
 <style type="text/css">
   p#linktest a:visited { color: CornflowerBlue }
 </style>
  </head>
  <body onload="DetectSites()">
    <b>Social Bookmarking Sites You've Visited</b>
    <p id="linktest" style="visibility:hidden"></p>
    <p id="visited_sites"></p>
  </body>
</html>

Of course, after writing the aforementioned code it occurred to me to run a Web search, and I found that there are bits of code for doing this all over the Web in places like Jeremiah Grossman’s blog (Firefox only) and GNUCITIZEN.

At least now I have it in a handy format; cut, paste and go.

Now all I need is some free time in which to tweak my weblog to start using the above function instead of showing people links to services they don’t use.

Now playing: Disturbed - Numb

Categories: Programming | Web Development

February 7, 2008

@ 04:00 AM

Can Someone Update My Wikipedia Entry?

I was recently reading a blog post in response to one of my posts and noticed that the author used my Wikipedia entry as the primary resource to figure out who I am. The only problem with that is my Wikipedia entry is pretty outdated and quite scanty. Since it is poor form to edit your own Wikipedia entry, I am in a quandary.

My resume is slightly more up to date; it is primarily missing descriptions of the stuff I worked on at Microsoft last year. Thus I saw two choices. I could either change the “About Me” link on my blog to point to my resume or I could implore some kind soul in my readership to update my entry in Wikipedia. I decided to start with the latter and if that doesn’t work out, I’ll be updating the “About Me” link to point to my resume.

Thanks in advance to anyone who takes the time to update my Wikipedia entry (not vandalize it :) ).

Now playing: Ice Cube - Ghetto Vet

Categories: Personal

February 5, 2008

@ 05:09 PM

Targeted Ads on Facebook

First Round Capital ad asks 'Leaving Microsoft?'

Amusing. I wonder what kind of ads YHOO employees are getting in Facebook these days?

Now playing: Abba - Money, Money, Money

Categories: Social Software

February 3, 2008

@ 06:48 PM

Some Thoughts on the Google Social Graph API

On Friday of last week, Brad Fitzpatrick posted an entry on the Google Code blog entitled URLs are People, Too where he wrote

So you've just built a totally sweet new social app and you can't wait for people to start using it, but there's a problem: when people join they don't have any friends on your site. They're lonely, and the experience isn't good because they can't use the app with people they know. You could ask them to search for and add all their friends, but you know that every other app is asking them to do the same thing and they're getting sick of it. Or they tried address book import, but that didn't totally work, because they don't even have all their friends' email addresses (especially if they only know them from another social networking site!). What's a developer to do?

One option is the new Social Graph API, which makes information about the public connections between people on the Web easily available and useful
...
Here's how it works: we crawl the Web to find publicly declared relationships between people's accounts, just like Google crawls the Web for links between pages. But instead of returning links to HTML documents, the API returns JSON data structures representing the social relationships we discovered from all the XFN and FOAF. When a user signs up for your app, you can use the API to remind them who they've said they're friends with on other sites and ask them if they want to be friends on your new site.

I talked to Dewitt Clinton, Kevin Marks and Brad Fitzpatrick about this API at the O'Reilly Social Graph FOO Camp and I think it is very interesting. Before talking about the API, I did want to comment on the fact that this is the second time I've seen a Google employee ship something that implies that any developer can just write custom code to do data analysis on top of their search index (i.e. Google's copy of the World Wide Web) and then share that information with the world. The first time was Ian Hickson's work with Web authoring statistics. That is cool.

Now back to the Google Social Graph API. An illuminating aspect of my conversations at the Social Graph FOO Camp is that, for established social apps, the scenario described by Brad, where social applications would like to bootstrap the user's experience by showing them their friends who use the service, is more important than the "invite my friends to join this new social networking site" scenario. This is interesting primarily because both goals are currently achieved via the anti-pattern of requesting a user's username and password for their email service provider and screen scraping their address book. The Social Graph API attempts to eliminate the need for this ugly practice by crawling a user's publicly articulated relationships and then providing a public API that social apps can use to find the user's identities on other services as well as their relationships with other users on those services.

The API uses URIs as the primary identifier for users instead of email addresses. Of course, since there is often an intuitive way to convert a username to a URI (e.g. 'carnage4life on Twitter' => http://www.twitter.com/carnage4life), users simply need to provide a username instead of a URI.
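A minimal sketch of that username-to-URI mapping follows. The pattern table and the function name are illustrative assumptions, not part of the Social Graph API; only the Twitter pattern comes from the example above.

```javascript
// Hypothetical table of profile URL patterns, keyed by service name.
// Only the Twitter pattern is taken from the post; a real table would have more.
var profileUrlPatterns = {
  twitter: "http://www.twitter.com/{username}"
};

// Convert a username on a known service into the URI that identifies that user.
function usernameToUri(service, username) {
  if (!Object.prototype.hasOwnProperty.call(profileUrlPatterns, service)) {
    throw new Error("No URL pattern known for service: " + service);
  }
  var pattern = profileUrlPatterns[service];
  return pattern.replace("{username}", encodeURIComponent(username));
}

var uri = usernameToUri("twitter", "carnage4life");
// uri is "http://www.twitter.com/carnage4life"
```

The point of the table is that the user only ever types a username; the application owns the knowledge of how each service lays out its profile URLs.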

So how would this work in the real world? Let's say I signed up for Facebook for the first time today. At this point my experience on the site would be pretty lame because I've made no friends, so my news feed would be empty and I'm not connected to anyone I know on the site yet. Now instead of Facebook collecting the username and password for my email address provider to screen scrape my address book (boo hiss), it shows a list of social networking sites and asks for just my username on those sites. On obtaining my username on Twitter, it maps that to a URI and passes that to the Social Graph API. This returns a list of people I'm following on Twitter with various identifiers for them, which Facebook in turn looks up in their user database and then prompts me to add them as my friends on the site if any of them are Facebook users.
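The matching step in that flow can be sketched roughly as follows. Everything here is an assumption for illustration: lookupContacts stands in for a call to the Social Graph API, findLocalUser stands in for the site's user database, and the alice and bob accounts are made up.

```javascript
// Suggest friends for a new user, given the URI of their profile on another service.
// Both callbacks are hypothetical stand-ins:
//   lookupContacts(uri) returns an array of URIs the person at `uri` publicly links to
//   findLocalUser(uri)  returns a local user record, or null if nobody claims that URI
function suggestFriends(profileUri, lookupContacts, findLocalUser) {
  var suggestions = [];
  var contacts = lookupContacts(profileUri);
  for (var i = 0; i < contacts.length; i++) {
    var localUser = findLocalUser(contacts[i]);
    if (localUser) {
      suggestions.push(localUser);
    }
  }
  return suggestions;
}

// Canned data standing in for the API response and the user database
var fakeGraph = {
  "http://www.twitter.com/carnage4life": [
    "http://www.twitter.com/alice",
    "http://www.twitter.com/bob"
  ]
};
var fakeUsers = {
  "http://www.twitter.com/alice": { name: "Alice" }
};

var result = suggestFriends(
  "http://www.twitter.com/carnage4life",
  function (uri) { return fakeGraph[uri] || []; },
  function (uri) { return fakeUsers[uri] || null; }
);
// result holds only Alice's record, since bob has no account on the new site
```

Note that no password changes hands anywhere in this flow; the only input from the user is a public identifier.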

This is a good idea that gets around the proliferation of applications that collect usernames and passwords from users to try to access their social graph on other sites. However, there are lots of practical problems with relying on this as an alternative to screen scraping and other approaches intended to discover a user's social graph, including:

many social networking sites don't expose their friend lists as FOAF or XFN
many friend lists on social networking sites are actually hidden from the public Web (e.g. most friend lists on Facebook) which is by design
many friend lists in social apps aren't even on the Web (e.g. buddy lists from IM clients, address books in desktop clients)

That said, this is a good contribution to this space. Ideally, the major social networking sites and address book providers would also expose APIs that social applications can use to obtain a user's social graph without resorting to screen scraping. We are definitely working on that at Windows Live with the Windows Live Contacts API. I'd love to see other social software vendors step up and provide similar APIs in the coming months. That way everybody wins: our users, our applications and the entire ecosystem.

Now Playing: Playaz Circle - Duffle Bag Boy (feat. Lil Wayne)

Categories: Platforms | Social Software

February 3, 2008

@ 06:21 PM

Some Thoughts on the Movable Type Action Streams Plugin

A few days ago I got a Facebook message from David Recordon about Six Apart's release of the ActionStreams plugin. The meat of the announcement is excerpted below

Today, we're shipping the next step in our vision of openness -- the Action Streams plugin -- an amazing new plugin for Movable Type 4.1 that lets you aggregate, control, and share your actions around the web. Now of course, there are some social networking services that have similar features, but if you're using one of today's hosted services to share your actions it's quite possible that you're giving up either control over your privacy, management of your identity or profile, or support for open standards. With the Action Streams plugin you keep control over the record of your actions on the web. And of course, you also have full control over showing and hiding each of your actions, which is the kind of privacy control that we demonstrated when we were the only partners to launch a strictly opt-in version of Facebook Beacon. Right now, no one has shipped a robust and decentralized complement to services like Facebook's News Feed, FriendFeed, or Plaxo Pulse. The Action Streams plugin, by default, also publishes your stream using Atom and the Microformat hAtom so that your actions aren't trapped in any one service. Open and decentralized implementations of these technologies are important to their evolution and adoption, based on our experiences being involved in creating TrackBack, Atom, OpenID, and OAuth. And we hope others join us as partners in making this a reality.

This is a clever idea, although I wouldn't compare it to the Facebook News Feed (what my social network is doing); it is instead a self-hosted version of the Facebook Mini-Feed (what I've been doing). Although people have been doing this for a while by aggregating their various feeds and republishing to their blog (life streams?), I think this is the first time that a full-fledged framework for doing this has been shipped as an out-of-the-box solution.

Mark Paschal has a blog post entitled Building Action Streams which gives an overview of how the framework works. You define templates which contain patterns that should be matched in a feed (RSS/Atom) or in an HTML document, along with how to convert the matched elements into a blog post. Below is the template for extracting and republishing del.icio.us links from the site's RSS feeds.

delicious:
    links:
        name: Links
        description: Your public links
        html_form: '[_1] saved the link <a href="[_2]">[_3]</a>'
        html_params:
            - url
            - title
        url: 'http://del.icio.us/rss/{{ident}}'
        identifier: url
        xpath:
            foreach: //item
            get:
                created_on: dc:date/child::text()
                title: title/child::text()
                url: link/child::text()

It reminds me a little of XSLT. I almost wondered why they didn't just use that until I saw that it also supports pattern matching HTML docs using Web::Scraper [and that XSLT is overly verbose and difficult to grok at first glance].
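To make the html_form line in the template concrete, here is a rough sketch of how numbered placeholders like [_1] could be filled in. This is not the plugin's code (the plugin is written for Movable Type); fillTemplate is a hypothetical helper, and the assumption that [_1] is the author's name while [_2] and [_3] come from html_params is a guess based on the template above.

```javascript
// Replace numbered placeholders such as [_1], [_2] with the matching value (1-based).
// Placeholders without a corresponding value are left untouched.
function fillTemplate(form, values) {
  return form.replace(/\[_(\d+)\]/g, function (match, n) {
    var value = values[Number(n) - 1];
    return value === undefined ? match : value;
  });
}

var htmlForm = '[_1] saved the link <a href="[_2]">[_3]</a>';
var line = fillTemplate(htmlForm, ["Dare", "http://example.com/", "An example page"]);
// line is 'Dare saved the link <a href="http://example.com/">An example page</a>'
```

A real implementation would also need to HTML-escape the values pulled from the feed before inserting them; the sketch skips that to keep the substitution itself visible.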

Although this is a pretty cool tool, I don't find it interesting as a publishing tool. On the other hand, its potential as a new kind of aggregator is very interesting. I'd love to see someone slap more UI on it and make it a decentralized version of the Facebook News Feed. Specifically, I'd like to feed it a blogroll, have it use the Google Social Graph API to figure out the additional services that the people in my subscriptions use, and then build a feed reader + news feed experience on top of it. That would be cool.

Come to think of it, this would be something interesting to experiment with in future versions of RSS Bandit.

Now Playing: Birdman - Pop Bottles (remix) (feat. Jim Jones & Fabolous)

Categories: Platforms | Social Software

February 3, 2008

@ 06:19 PM

MSFT + YHOO: Question for the Armchair Quarterbacks

Given that I work in Microsoft's online services group and have friends at Yahoo!, I obviously won't be writing down my thoughts on Microsoft's $44.6 billion bid for Yahoo. However, I have been somewhat amused by the kind of ranting I've seen in the comments at Mini-Microsoft. Although the majority of the comments on Mini-Microsoft are critical of the bid, it is clear that the majority of the posters aren't very knowledgeable about Microsoft, its competitors or the online business in general.

There were comments from people who are so out of it they think Paul Allen is a majority shareholder of Microsoft. Or even better, that Internet advertising will never impact newspaper, magazine or television advertising. I was also amused by the person who asked if anyone could name 2 or 3 successful acquisitions or technology purchases by Microsoft. I wonder if anyone would say the Bungie or Visio acquisitions didn't work out for the company. Or that the products that started off as NCSA Mosaic or Sybase SQL have been unsuccessful as Microsoft products.

My question for the armchair quarterbacks that have criticized this move in places like Mini-Microsoft is "If you ran the world's most successful software company, what would you do instead?"

PS: The ostrich strategy of "ignoring the Internet" and milking the Office + Windows cash cows doesn't count as an acceptable answer. Try harder than that.

Now Playing: Birdman - Hundred Million Dollars (feat. Rick Ross, Lil' Wayne & Young Jeezy)

Categories: Life in the B0rg Cube

February 1, 2008

@ 04:45 PM

Why You Shouldn't Use Wireless at Conferences

As I'm getting ready to miss the first Super Bowl weekend of my married life to attend the O'Reilly Social Graph FOO Camp, I'm reminded that I should be careful about using wireless at the conference by this informative yet clueless post by Larry Dignan on ZDNet entitled Even SSL Gmail can get sidejacked which states

Sidejacking is a term Graham uses to describe his session hijacking hack that can compromise nearly all Web 2.0 applications that rely on saved cookie information to seamlessly log people back in to an account without the need to reenter the password. By listening to and storing radio signals from the airwaves with any laptop, an attacker can harvest cookies from multiple users and go in to their Web 2.0 application. Even though the password wasn’t actually cracked or stolen, possession of the cookies acts as a temporary key to gain access to Web 2.0 applications such as Gmail, Hotmail, and Yahoo. The attacker can even find out what books you ordered on Amazon, where you live from Google maps, acquire digital certificates with your email account in the subject line, and much more.

Gmail in SSL https mode was thought to be safe because it encrypted everything, but it turns out that Gmail’s JavaScript code will fall back to non-encrypted http mode if https isn’t available. This is actually a very common scenario anytime a laptop connects to a hotspot before the user signs in where the laptop will attempt to connect to Gmail if the application is opened but it won’t be able to connect to anything. At that point in time Gmail’s JavaScripts will attempt to communicate via unencrypted http mode and it’s game over if someone is capturing the data.

What’s really sad is the fact that Google Gmail is one of the “better” Web 2.0 applications out there and it still can’t get security right even when a user actually chooses to use SSL mode.

Although the blog post is about a valid concern, the increased likelihood of man-in-the-middle attacks when using unsecured or shared wireless networks, it presents it in the most ridiculous way possible. Man-in-the-middle attacks are a problem related to using computer networks, not something that is limited to the Web let alone Web 2.0 (whatever that means).

Now Playing: 50 Cent - Touch The Sky (Feat. Tony Yayo) (Prod by K Lassik)

Categories: Mindless Link Propagation

February 1, 2008

@ 03:59 PM

Slashdot on Microsoft's Bid on Yahoo

Obviously, this is the top story on all the tech news sites this morning. My favorite take so far has been from a post on Slashdot entitled Implications for open source which is excerpted below

A consolidation of the Microsoft and Yahoo networks could shift a massive amount of infrastructure from open source technologies to Microsoft platforms. Microsoft said that "eliminating redundant infrastructure and duplicative operating costs will improve the financial performance of the combined entity." Yahoo has been a major player in several open source projects. Most of Yahoo's infrastructure runs on FreeBSD, and the lead developer of PHP, Rasmus Lerdorf, works as an engineer at Yahoo. Yahoo has also been a major contributor to Hadoop, an open source technology for distributed computing. Data Center Knowledge [datacenterknowledge.com] has more on the infrastructure implications.

I listened in on the conference call and, although the highlighted quote is paraphrased, it is similar to what I remember raising my eyebrows at when I heard it over the phone, given my day job.

What a day to not be going into work...

Categories:

February 1, 2008

@ 12:52 PM