July 2, 2008
@ 01:56 PM

Jeff Atwood recently published two anti-XML rants on his blog, entitled XML: The Angle Bracket Tax and Revisiting the XML Angle Bracket Tax. The source of his beef with XML and his recommendations to developers are excerpted below:

Everywhere I look, programmers and programming tools seem to have standardized on XML. Configuration files, build scripts, local data storage, code comments, project files, you name it -- if it's stored in a text file and needs to be retrieved and parsed, it's probably XML. I realize that we have to use something to represent reasonably human readable data stored in a text file, but XML sometimes feels an awful lot like using an enormous sledgehammer to drive common household nails.

I'm deeply ambivalent about XML. I'm reminded of this Winston Churchill quote:

It has been said that democracy is the worst form of government except all the others that have been tried.

XML is like democracy. Sometimes it even works. On the other hand, it also means we end up with stuff like this:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <m:GetLastTradePrice xmlns:m="Some-URI">
      <symbol>DIS</symbol>
    </m:GetLastTradePrice>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

You could do worse than XML. It's a reasonable choice, and if you're going to use XML, then at least learn to use it correctly. But consider:
  1. Should XML be the default choice?
  2. Is XML the simplest possible thing that can work for your intended use?
  3. Do you know what the XML alternatives are?
  4. Wouldn't it be nice to have easily readable, understandable data and configuration files, without all those sharp, pointy angle brackets jabbing you directly in your ever-lovin' eyeballs?

I don't necessarily think XML sucks, but the mindless, blanket application of XML as a dessert topping and a floor wax certainly does. Like all tools, it's a question of how you use it. Please think twice before subjecting yourself, your fellow programmers, and your users to the XML angle bracket tax. <CleverEndQuote>Again.</CleverEndQuote>

The question of whether and when to use XML is one I am intimately familiar with, given that I spent the first 2.5 years of my professional career at Microsoft working on the XML team as the "face of XML" on MSDN.

My problem with Jeff's articles is that they take a very narrow view of how to evaluate a technology. No one should argue that XML is the simplest or most efficient technology for the uses it has been put to today; it isn't. The value of XML lies not in its simplicity or its efficiency but in the massive ecosystem of knowledge and tools that has grown up around it.

If I decide to use XML for my data format, I can be sure that my data will be consumable using a variety of off-the-shelf tools on practically every platform in use today. In addition, there are a variety of tools for authoring XML, transforming it to HTML or text, parsing it, converting it to objects, mapping it to database schemas, validating it against a schema, and so on. Want to convert my XML config file into a pretty HTML page? I can use XSLT or CSS. Want to validate my XML against a schema? I have my choice of Schematron, Relax NG and XSD. Want to find stuff in my XML document? XPath and XQuery to the rescue. And so on.
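For instance, here is a minimal C# sketch of what that looks like in practice. The config file and key names are made up, but the point stands: LINQ to XML and XPath do all of the parsing work for me, with no hand-rolled format reader in sight.

using System;
using System.Xml.Linq;
using System.Xml.XPath;

class XmlToolingDemo
{
    static void Main()
    {
        // A hypothetical application config file; any off-the-shelf XML parser
        // on any platform can read this without custom parsing code.
        XDocument config = XDocument.Parse(
            "<appSettings>" +
            "  <add key='cacheSize' value='500' />" +
            "  <add key='logLevel' value='Warn' />" +
            "</appSettings>");

        // Query it with XPath, one of the many off-the-shelf XML tools mentioned above.
        foreach (XElement setting in config.XPathSelectElements("/appSettings/add"))
        {
            Console.WriteLine("{0} = {1}",
                (string)setting.Attribute("key"),
                (string)setting.Attribute("value"));
        }
    }
}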

No other data format hits a similar sweet spot when it comes to ease of use, popularity and breadth of tool ecosystem.

So the real question to ask yourself before taking on the "Angle Bracket Tax", as Jeff Atwood puts it, is whether the benefits of avoiding XML outweigh the costs of giving up XML's tool ecosystem and the familiarity that practically every developer out there has with the technology. In some cases they do, such as when deciding whether to go with JSON over XML in AJAX applications (I've given two reasons in the past why JSON is a better choice). On the other hand, I can't imagine a good reason to roll your own data format for office documents or application configuration files instead of using XML.

FURTHER READING
  • The XML Litmus Test - Dare Obasanjo provides some simple guidelines for determining when XML is the appropriate technology to use in a software application or architecture design. (6 printed pages)
  • Understanding XML - Learn how the Extensible Markup Language (XML) facilitates universal data access. XML is a plain-text, Unicode-based meta-language: a language for defining markup languages. It is not tied to any programming language, operating system, or software vendor. XML provides access to a plethora of technologies for manipulating, structuring, transforming and querying data. (14 printed pages)

Now Playing: Metallica - The God That Failed


 

Categories: XML

Late last week, the folks on the Google Data APIs blog announced that Google will now be supporting OAuth as the delegated authentication mechanism for all Google Data APIs. This move is meant to encourage the various online services that provide APIs that access a user’s data in the “cloud” to stop reinventing the wheel when it comes to delegated authentication and standardize on a single approach.

Every well-designed Web API that provides access to a customer's data in the cloud utilizes a delegated authentication mechanism that allows users to grant 3rd party applications access to their data without having to give those applications their username and password. There is a good analogy for this practice on the OAuth: Introduction page, which is excerpted below:

What is it For?

Many luxury cars today come with a valet key. It is a special key you give the parking attendant and unlike your regular key, will not allow the car to drive more than a mile or two. Some valet keys will not open the trunk, while others will block access to your onboard cell phone address book. Regardless of what restrictions the valet key imposes, the idea is very clever. You give someone limited access to your car with a special key, while using your regular key to unlock everything.

Everyday new website offer services which tie together functionality from other sites. A photo lab printing your online photos, a social network using your address book to look for friends, and APIs to build your own desktop application version of a popular site. These are all great services – what is not so great about some of the implementations available today is their request for your username and password to the other site. When you agree to share your secret credentials, not only you expose your password to someone else (yes, that same password you also use for online banking), you also give them full access to do as they wish. They can do anything they wanted – even change your password and lock you out.

This is what OAuth does, it allows you, the User, to grant access to your private resources on one site (which is called the Service Provider), to another site (called Consumer, not to be confused with you, the User). While OpenID is all about using a single identity to sign into many sites, OAuth is about giving access to your stuff without sharing your identity at all (or its secret parts).

So every service provider invented their own protocol to do this, all of which are different but have the same basic components. Today we have Google AuthSub, Yahoo! BBAuth, Windows Live DelAuth, AOL OpenAuth, the Flickr Authentication API, the Facebook Authentication API and others. All different, proprietary solutions to the same problem.

This ends up being problematic for developers because if you want to build an application that talks to multiple services, you have to deal not only with the different APIs provided by these services but also with the different authorization/authentication models they utilize. In a world where "social aggregation" is becoming more commonplace with services like Plaxo Pulse & FriendFeed, and where more applications like OutSync and RSS Bandit are trying to bridge the desktop/cloud divide, it sucks that these applications have to rewrite the same type of code over and over again to deal with the basic task of getting permission to access a user's data. Standardizing on OAuth is meant to fix that. A number of startups like Digg & Twitter as well as major players like Yahoo and Google have promised to support it, so this should make the lives of developers easier.
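To see what gets standardized, here is a hedged C# sketch of the kind of request signing OAuth 1.0 calls for: an HMAC-SHA1 signature over a base string built from the HTTP method, the URL and the sorted request parameters. The consumer key, secrets, token and URL below are made up, and Uri.EscapeDataString only approximates the strict percent-encoding the spec requires, so treat this as an illustration rather than a drop-in implementation.

using System;
using System.Security.Cryptography;
using System.Text;

class OAuthSignatureSketch
{
    static void Main()
    {
        // Hypothetical secrets; a real consumer obtains these from the service provider.
        string consumerSecret = "kd94hf93k423kf44";
        string tokenSecret = "pfkkdhi9sl3r4s00";

        // OAuth 1.0 signature base string: METHOD & encoded URL & encoded, alphabetically
        // sorted parameter string (all values here are invented for the example).
        string normalizedParams = "oauth_consumer_key=dpf43f3p2l4k3l03&oauth_nonce=kllo9940&" +
                                  "oauth_signature_method=HMAC-SHA1&oauth_timestamp=1214928000&" +
                                  "oauth_token=nnch734d00sl2jdk&oauth_version=1.0";
        string baseString = "GET" + "&" +
                            Uri.EscapeDataString("http://example.com/photos") + "&" +
                            Uri.EscapeDataString(normalizedParams);

        // The signing key is the consumer secret and token secret joined by '&'.
        string key = Uri.EscapeDataString(consumerSecret) + "&" + Uri.EscapeDataString(tokenSecret);
        using (var hmac = new HMACSHA1(Encoding.ASCII.GetBytes(key)))
        {
            string signature = Convert.ToBase64String(
                hmac.ComputeHash(Encoding.ASCII.GetBytes(baseString)));
            Console.WriteLine("oauth_signature=" + Uri.EscapeDataString(signature));
        }
    }
}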

Of course, we still have work to do as an industry when it comes to the constant wheel reinvention in the area of Web APIs. Chris Messina points to another place where every major service provider has invented a different proprietary protocol for the same task in his post Inventing contact schemas for fun and profit! (Ugh), where he writes:

And then there were three
...
Today, Yahoo! announced the public availability of their own Address Book API.

However, I have to lament yet more needless reinvention of contact schema. Why is this a problem? Well, as I pointed out about Facebook’s approach to developing their own platform methods and formats, having to write and debug against yet another contact schema makes the “tax” of adding support for contact syncing and export increasingly onerous for sites and web services that want to better serve their customers by letting them host and maintain their address book elsewhere.

This isn’t just a problem that I have with Yahoo!. It’s something that I encountered last November with the SREG and proposed Attribute Exchange profile definition. And yet again when Google announced their Contacts API. And then again when Microsoft released theirs! Over and over again we’re seeing better ways of fighting the password anti-pattern flow of inviting friends to new social services, but having to implement support for countless contact schemas. What we need is one common contacts interchange format and I strongly suggest that it inherit from vcard with allowances or extension points for contemporary trends in social networking profile data.

I’ve gone ahead and whipped up a comparison matrix between the primary contact schemas to demonstrate the mess we’re in.

Kudos to the folks at Google for trying to force the issue when it comes to standardizing on a delegated authentication protocol for use on the Web. However, there are still lots of places across the industry where we speak different protocols and thus incur a needless burden on developers when a single language might do. It would be nice to see some of this unnecessary redundancy eliminated in the future.

Now Playing: G-Unit - I Like The Way She Do It


 

Categories: Platforms | Web Development

Recently I've been bumping into more and more people who've either left Google to come to Microsoft or got offers from both companies and picked Microsoft over Google. I believe this is part of a larger trend, especially since I've seen lots of people who left the company for "greener pastures" return in the past year (at least 8 people I know personally have rejoined). However, in this blog post I'll stick to talking about people who've chosen Microsoft over Google.

First of all, there's the post by Sergey Solyanik entitled Back to Microsoft, where he primarily gripes about the culture and lack of career development at Google. Some key excerpts are:

Last week I left Google to go back to Microsoft, where I started this Monday (and so not surprisingly, I was too busy to blog about it)

So why did I leave?

There are many things about Google that are not great, and merit improvement. There are plenty of silly politics, underperformance, inefficiencies and ineffectiveness, and things that are plain stupid. I will not write about these things here because they are immaterial. I did not leave because of them. No company has achieved the status of the perfect workplace, and no one ever will.

I left because Microsoft turned out to be the right place for me.

Google software business is divided between producing the "eye candy" - web properties that are designed to amuse and attract people - and the infrastructure required to support them. Some of the web properties are useful (some extremely useful - search), but most of them primarily help people waste time online (blogger, youtube, orkut, etc)

This orientation towards cool, but not necessarilly useful or essential software really affects the way the software engineering is done. Everything is pretty much run by the engineering - PMs and testers are conspicuously absent from the process. While they do exist in theory, there are too few of them to matter.

On one hand, there are beneficial effects - it is easy to ship software quickly…On the other hand, I was using Google software - a lot of it - in the last year, and slick as it is, there's just too much of it that is regularly broken. It seems like every week 10% of all the features are broken in one or the other browser. And it's a different 10% every week - the old bugs are getting fixed, the new ones introduced. This across Blogger, Gmail, Google Docs, Maps, and more

The culture part is very important here - you can spend more time fixing bugs, you can introduce processes to improve things, but it is very, very hard to change the culture. And the culture at Google values "coolness" tremendously, and the quality of service not as much. At least in the places where I worked.

The second reason I left Google was because I realized that I am not excited by the individual contributor role any more, and I don't want to become a manager at Google.

The Google Manager is a very interesting phenomenon. On one hand, they usually have a LOT of people from different businesses reporting to them, and are perennially very busy.

On the other hand, in my year at Google, I could not figure out what was it they were doing. The better manager that I had collected feedback from my peers and gave it to me. There was no other (observable by me) impact on Google. The worse manager that I had did not do even that, so for me as a manager he was a complete no-op. I asked quite a few other engineers from senior to senior staff levels that had spent far more time at Google than I, and they didn't know either. I am not making this up!

Sergey isn't the only senior engineer I know who has contributed significantly to Google projects and then decided Microsoft was a better fit for him. Danny Thorpe, who worked on Google Gears, is back at Microsoft for his second stint, working on developer technologies related to Windows Live. These aren't the only folks I've seen make the switch from the big G to the b0rg; they're just the ones who have blogs that I can point at.

Unsurprisingly, the fact that Google isn't a good place for senior developers is also becoming evident in their interview process. Take this post from Svetlin Nakov entitled Rejected a Program Manager Position at Microsoft Dublin - My Successful Interview at Microsoft, where he concludes:

My Experience at Interviews with Microsoft and Google

Few months ago I was interviewed for a software engineer in Google Zurich. If I need to compare Microsoft and Google, I should tell it in short: Google sux! Here are my reasons for this:

1) Google interview were not professional. It was like Olympiad in Informatics. Google asked me only about algorithms and data structures, nothing about software technologies and software engineering. It was obvious that they do not care that I had 12 years software engineering experience. They just ignored this. The only think Google wants to know about their candidates are their algorithms and analytical thinking skills. Nothing about technology, nothing about engineering.

2) Google employ everybody as junior developer, ignoring the existing experience. It is nice to work in Google if it is your first job, really nice, but if you have 12 years of experience with lots of languages, technologies and platforms, at lots of senior positions, you should expect higher position in Google, right?

3) Microsoft have really good interview process. People working in Microsoft are relly very smart and skillful. Their process is far ahead of Google. Their quality of development is far ahead of Google. Their management is ahead of Google and their recruitment is ahead of Google.

Microsoft is Better Place to Work than Google

At my interviews I was asking my interviewers in both Microsoft and Google a lot about the development process, engineering and technologies. I was asking also my colleagues working in these companies. I found for myself that Microsoft is better organized, managed and structured. Microsoft do software development in more professional way than Google. Their engineers are better. Their development process is better. Their products are better. Their technologies are better. Their interviews are better. Google was like a kindergarden - young and not experienced enough people, an office full of fun and entertainment, interviews typical for junior people and lack of traditions in development of high quality software products.

Based on my observations, my theory is that Google's big problem is that the company hasn't realized it isn't a startup anymore. This disconnect between the company's status and its perception of itself manifests in a number of ways:

  1. Startups don't have a career path for their employees. Does anyone at Facebook know what they want to be in five years besides rich? However, once riches are no longer guaranteed and the stock isn't firing on all cylinders (GOOG is underperforming both the NASDAQ and the Dow Jones Industrial Average this year), you need a career plan for your employees that goes beyond "free lunches and all the foosball you can handle".

  2. There is no legacy code at a startup. When your code base is young, it isn't a big deal to have developers checking in new features after an overnight coding fit powered by caffeine and pizza. For the most part, the code base shouldn't be large enough or interdependent enough for one change to cause issues. However, it is practically a law of software development that the older your code base gets, the more lines of code it accumulates and the more closely coupled your modules become. This means changing things in one part of the code can have adverse effects in another.

    As all organizations mature, they tend to add PROCESS. These processes exist to insulate companies from the mistakes that occur after a company gets to a certain size and can no longer trust its employees to always do the right thing. Requiring code reviews, design specifications, black box, white box and unit testing, usability studies, threat models and so on is the kind of overhead that differentiates a mature software development shop from a "fly by the seat of your pants" startup. However, once you've been through enough fire drills, some of those processes don't sound as bad as they once did. This is why senior developers value them while junior developers don't; the latter haven't been around the block enough.

  3. There is less politics at a startup. In any activity where humans have to come together collaboratively to achieve a goal, there will always be people with different agendas. The more people you add to the mix, the more agendas you have to contend with. Doing things by consensus is OK when you have to get consensus from two or three people who sit in the same hallway as you. It's a totally different ball game when you need to gain it from lots of people from across a diverse company working on different projects in different regions of the world who have different perspectives on how to solve your problems. At Google, even hiring an undergraduate candidate has to go through several layers of committees, which means hiring managers need some political savvy if they want to get their candidates approved. The founders of Dodgeball quit Google after their startup was acquired, once they realized that they didn't have the political savvy to get resources allocated to their project.

The fact that Google is having problems retaining employees isn't news; Fortune wrote an article about it just a few months ago. The technology press makes it seem like people are ditching Google for hot startups like FriendFeed and Facebook. However, the truth is more nuanced than that. Now that Google is just another big software company, lots of people are comparing it to other big software companies like Microsoft and finding it lacking.

Now Playing: Queen - Under Pressure (feat. David Bowie)


 

Categories: Life in the B0rg Cube

Last week TechCrunch UK wrote about True Knowledge, a search startup that utilizes AI/Semantic Web techniques. The post, entitled VCs price True Knowledge at £20m pre-money. Is this the UK's Powerset?, stated:

The chatter I’m hearing is that True Knowledge is being talked about in hushed tones, as if it might be the Powerset of the UK. To put that in context, Google has tried to buy the Silicon Valley search startup several times, and they have only launched a showcase product, not even a real one. However, although True Knowledge and Powerset are similar, they are different in significant ways, more of which later.
...
Currently in private beta, True Knowledge says their product is capable of intelligently answering - in plain English - questions posed on any topic. Ask it if Ben Affleck is married and it will come back with "Yes" rather than lots of web pages which may or may not have the answer (don’t ask me!).
...
Here's why the difference matters. True Knowledge can infer answers that the system hasn't seen. Inferences are created by combining different bits of data together. So for instance, without knowing the answer it can work out how tall the Eiffel Tower is by inferring that it is shorter than the Empire State Building but higher than St Pauls Cathedral.
...
AI software developer and entrepreneur William Tunstall-Pedoe is the founder of True Knowledge. He previously developed a technology that can solve commercially published crossword clues but also explain how the clues work in plain English. See the connection?

The scenarios described in the TechCrunch write up should sound familiar to anyone who has spent any time around fans of the Semantic Web. Creating intelligent agents that can interrogate structured data on the Web and infer new knowledge has turned out to be easier said than done because, for the most part, content on the Web isn't organized according to the structure of the underlying data. This is primarily because HTML is a presentational language. Of course, even if information on the Web were structured data (i.e. idiomatic XML formats), we would still need to build machinery to translate between all of these XML formats.
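To make that last point concrete, here is a small C# sketch that translates between two made-up contact formats using LINQ to XML. Both formats and all element names are invented for the example; the point is that even with great XML tooling, somebody still has to write and maintain this sort of glue for every pair of formats.

using System;
using System.Linq;
using System.Xml.Linq;

class ContactFormatTranslator
{
    static void Main()
    {
        // A hypothetical "source" contact format.
        XDocument sourceDoc = XDocument.Parse(
            "<contacts><person first='Ben' last='Affleck' email='ben@example.com'/></contacts>");

        // Translating it into a different hypothetical format still requires
        // purpose-built glue code (or an equivalent XSLT stylesheet).
        XElement targetDoc = new XElement("addressBook",
            from p in sourceDoc.Descendants("person")
            select new XElement("entry",
                new XElement("displayName",
                    (string)p.Attribute("first") + " " + (string)p.Attribute("last")),
                new XElement("email", (string)p.Attribute("email"))));

        Console.WriteLine(targetDoc);
    }
}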

Finally, in the few areas on the Web where structured data in XML formats is commonplace such as Atom/RSS feeds for blog content, not a lot has been done with this data to fulfill the promise of the Semantic Web.

So if the Semantic Web is such an infeasible utopia, why are more and more search startups using it as the angle from which they will attack Google's dominance of Web search? The answer can be found in Bill Slawski's post from a year ago entitled Finding Customers Through Anti-Commercial Queries, where he wrote:

Most Queries are Noncommercial

The first step might be to recognize that most queries conducted by people at search engines aren't aimed at buying something. A paper from the WWW 2007 held this spring in Banff, Alberta, Canada, Determining the User Intent of Web Search Engine Queries, provided a breakdown of the types of queries that they were able to classify.

Their research uncovered the following numbers: "80% of Web queries are informational in nature, with about 10% each being navigational and transactional." The research points to the vast majority of searches being conducted for information gathering purposes. One of the indications of "information" queries that they looked for were searches which include terms such as: “ways to,” “how to,” “what is.”

Although the bulk of the revenue search engines make comes from people performing commercial queries such as searching for "incredible hulk merchandise", "car insurance quotes" or "ipod prices", these are actually a tiny proportion of the kinds of queries people want answered by search engines. The majority of searches are about the five Ws (and one H), namely "who", "what", "where", "when", "why" and "how". Such queries don't really need a list of Web pages as results; they simply require an answer. The search engine that can figure out how to answer user queries directly on the page, without making the user click through half a dozen pages to find the answer, will definitely have moved the needle when it comes to the Web search user experience.
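As a toy illustration of the heuristic from the paper quoted above, here is a crude C# sketch that flags queries as informational based on the cue phrases mentioned ("how to", "ways to", "what is" and the five Ws). Real query-intent classifiers are of course far more sophisticated; this is just to show how simple the signal can be.

using System;
using System.Linq;

class QueryIntentSketch
{
    // A crude approximation of the "informational query" cues described above.
    static readonly string[] InformationalCues =
        { "how to", "ways to", "what is", "who", "what", "where", "when", "why", "how" };

    static bool LooksInformational(string query)
    {
        string q = query.ToLowerInvariant();
        return InformationalCues.Any(cue => q.StartsWith(cue) || q.Contains(" " + cue + " "));
    }

    static void Main()
    {
        Console.WriteLine(LooksInformational("how tall is the eiffel tower")); // True
        Console.WriteLine(LooksInformational("ipod prices"));                  // False
    }
}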

This explains why scenarios that one usually associates with AI and Semantic Web evangelists are now being touted by the new generation of "Google-killers". The question is whether knowledge inference techniques will prove to be more effective than traditional search engine techniques when it comes to providing the best search results, especially since a lot of the traditional search engines are learning new tricks.

Now Playing: Bob Marley - Waiting In Vain


 

Categories: Technology

At the end of February of this year, I wrote a post entitled No Contest: FriendFeed vs. The Facebook News Feed, where I argued that it would be a two-month project for an enterprising developer at Facebook to incorporate all of the relevant features of FriendFeed that certain vocal bloggers had found so enticing. Since then we've had two announcements from Facebook.

From A new way to share with friends on April 15th:

we've introduced a way for you to import activity from other sites into your Mini-Feed (and into your friends' News Feeds).

The option to import stories from other sites can be found via the small "Import" link at the top of your Mini-Feed. Only a few sites—Flickr, Yelp, Picasa, and del.icio.us—are available for importing at the moment, but we'll be adding Digg and other sites in the near future. These stories will look just like any other Mini-Feed stories, and will hopefully increase your ability to share information with the people you care about.

From We're Open For Commentary on June 25th (yesterday):

In the past, you've been able to comment on photos, notes and posted items, but if there was something else on your friend's profile—an interesting status, or a cool new friendship—you'd need to send a message or write a Wall post to talk about it. But starting today, you can comment on your friends' Mini-Feed stories right from their profile.

Now you can easily converse around friends' statuses, application stories, new friendships, videos, and most other stories you see on their profile. Just click on the comment bubble icon to write a comment or see comments other people have written.

It took a little longer than two months, but it looks like I was right. For some reason Facebook isn't putting the comment bubbles in the News Feed, but I assume that is only temporary and that they are trying the feature out in the Mini-Feed first.

FriendFeed has always seemed to me to be a weird concept for a stand-alone application. Why would I want to go to a whole new site and create yet another friend list just to share what I'm doing on the Web with my friends? Isn't that what social networking sites are for? It just sounds so inconvenient, like carrying around a pager instead of a mobile phone.

As I said in my original post on the topic, all FriendFeed has going for it is the community that has built up around the site, especially since the functionality it provides can be easily duplicated and actually fits better as a feature of an existing social networking site. The question is whether that community is the kind that will grow the service into a mainstream success or whether it will remain primarily a playground for Web geeks despite all the hype (see del.icio.us as an example of this). So far, the latter looks like the stronger bet. For comparison, consider the growth curve of Twitter against that of FriendFeed on Google Trends and Alexa. Which seems more likely to one day have the brand awareness of a Flickr or a Facebook?

Now Playing: Bob Marley - I Shot The Sheriff


 

Categories: Social Software

Jason Kincaid over at TechCrunch has a blog post entitled Microsoft's First Step In Accepting OpenID SignOns - HealthVault, where he writes:

Over 16 months after first declaring its support for the OpenID authentication platform, Microsoft has finally implemented it for the first time, allowing for OpenID logins on its Health Vault medical site. Unfortunately, Health Vault will only support authentication from two OpenID providers: Trustbearer and Verisign. Whatever happened to the Open in OpenID?

The rationale behind the limited introduction is that health is sensitive, so access should be limited to the few, most trusted OpenID providers. It certainly makes sense, but it also serves to underscore one of the problems inherent to OpenID: security
...
But it seems that the platform itself may be even more deserving of scrutiny. What good is a unified login when its default form will only be accepted on the least private and secure sites?

A while back I mentioned that the rush to slap "Open" in front of every new spec written by a posse of Web companies had created a world where "Open" has devolved into a PR marketing term with no real meaning, since the term is used too broadly to describe different sorts of "openness". In this case, the "open" in OpenID has never meant that every service that accepts OpenIDs has to accept them from every OpenID provider.

Simon Willison, who's been a key evangelist of OpenID, has penned an insightful response to Jason Kincaid's article in his post The point of "Open" in OpenID, which is excerpted below:

TechCrunch report that Microsoft are accepting OpenID for their new HealthVault site, but with a catch: you can only use OpenIDs from two providers: Trustbearer (who offer two-factor authentication using a hardware token) and Verisign. "Whatever happened to the Open in OpenID?", asks TechCrunch’s Jason Kincaid.

Microsoft’s decision is a beautiful example of the Open in action, and I fully support it.

You have to remember that behind the excitement and marketing OpenID is a protocol, just like SMTP or HTTP. All OpenID actually provides is a mechanism for asserting ownership over a URL and then “proving” that assertion. We can build a pyramid of interesting things on top of this, but that assertion is really all OpenID gives us (well, that and a globally unique identifier). In internet theory terms, it’s a dumb network: the protocol just concentrates on passing assertions around; it’s up to the endpoints to set policies and invent interesting applications.
...
HealthVault have clearly made this decision due to security concerns—not over the OpenID protocol itself, but the providers that their users might choose to trust. By accepting OpenID on your site you are outsourcing the security of your users to an unknown third party, and you can’t guarantee that your users picked a good home for their OpenID. If you’re a bank or a healthcare provider that’s not a risk you want to take; whitelisting providers that you have audited for security means you don’t have to rule out OpenID entirely.

The expectation that services would have to create a white list of OpenID providers is not new thinking. Tim Bray blogged as much in his post on OpenID over a year ago, where he speculated that there would eventually be a market for rating OpenID providers so companies wouldn't have to individually audit each provider before deciding which ones to add to their white list.
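The white list itself is the easy part. Here is a hedged C# sketch of what a relying party's provider check might look like once discovery has resolved a claimed identifier to its provider endpoint. The provider hostnames are hypothetical stand-ins for whatever audited providers a site chooses; the hard work is the security auditing that decides what goes on the list in the first place.

using System;
using System.Collections.Generic;

class OpenIdProviderWhitelist
{
    // Hypothetical whitelist of audited OpenID provider endpoint hosts.
    static readonly HashSet<string> TrustedProviderHosts =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "pip.example-provider.com",
            "openid.example-hardware-token.com"
        };

    // Called after OpenID discovery; the rest of the login flow only proceeds
    // when the user's provider is on the audited whitelist.
    static bool IsProviderAllowed(Uri providerEndpoint)
    {
        return TrustedProviderHosts.Contains(providerEndpoint.Host);
    }

    static void Main()
    {
        Console.WriteLine(IsProviderAllowed(new Uri("https://pip.example-provider.com/server"))); // True
        Console.WriteLine(IsProviderAllowed(new Uri("https://openid.unknown-host.com/auth")));    // False
    }
}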

As more companies decide to accept OpenID as a login mechanism on their services, I suspect that either the community or some company will jump in to fill the niche that Tim Bray speculated about in his post. I can't imagine that it is fun having to audit all of the OpenID providers as part of deciding how people will log in to your site, nor does it make sense for everyone who plans to support OpenID to security-audit the same list of services. That sounds like a ton of wasted man-hours when the work could be done once and the results shared by all.

Now Playing: Big & Rich - Save a Horse (Ride a Cowboy)


 

Categories: Web Development

Over the past few months there have been a number of posts about how aggregators like FriendFeed are causing bloggers to "lose control of the conversation". Louis Gray captured some of the blogger angst about this topic in his post Should Fractured Feed Reader Comments Raise Blog Owners' Ire?, where he wrote:

While the discussion around where a blog's comments should reside has raised its head before, especially around services like FriendFeed, (See: Sarah Perez of Read Write Web: Blog Comments Still Matter) it flared up again this afternoon when I had (innocently, I thought) highlighted how one friend's blog post from earlier in the week was getting a lot of comments, and had become the most popular story on Shyftr, a next-generation RSS feed reader that enables comments within its service.

While I had hoped the author (Eric Berlin of Online Media Cultist, who I highlighted on Monday and like quite a bit) would be pleased to see his post had gained traction, the reaction was not what I had expected. He said he was uneasy about seeing his posts generate activity and community for somebody else. Another FriendFeed user called it "content theft" and said "if they ever pull my feed and use it there, they can expect to get hit with a DMCA take-down notice". (See the discussion here)

Surprisingly [at least to me], these aren't the only instances where people have become upset because more of the comments are happening on FriendFeed than on their posts. Colin Walker tells the story of Rob La Gesse, who signed up for FriendFeed only to cancel his account because his "friends" on the site preferred commenting on FriendFeed rather than on his blog.

I suspect that a lot of the people expressing outrage are new to blogging, which is why they expect their blog comments to be the be-all and end-all of conversation about their blog posts. This has never been the case. For one, blogs have long had to contend with social news sites like Slashdot, Digg and reddit, where users can submit stories and then comment on them. A post may have a handful of comments on the original blog but generate dozens or hundreds of responses on a social news site. For example, I recently wrote about functional programming in C# 3.0, and while there were fewer than 10 comments on my blog there were over 150 comments in the discussion of the post on reddit.

Besides social news sites, there are other bloggers to consider. People with their own blogs often prefer blogging a response to your post instead of leaving a comment on the original post. This is the reason services like Technorati and technologies like Trackback were invented. Am I "stealing the conversation" from Louis Gray's post by writing this blog post in response to his instead of leaving a comment?

Then there's email, IM and other forms of active sharing. I've lost count of the number of times people have told me that one of my blog posts was circulated around their group and a lively conversation ensued. Quite often, the referenced post has no comments.

In short, bloggers aren't losing control of the conversation due to services like FriendFeed because they never had it in the first place. You can't lose what you don't have.

When it comes to FriendFeed, there are two things I like about the fact that they enable comments on items. The first is that it is good for their users, since it provides a place to chat about content they find on the Web without having to send out email noise (i.e. starting conversations via passive instead of active sharing). The second is that it is good for FriendFeed, because it builds network effects and social lock-in into their product. Sure, anyone can aggregate RSS feeds from Flickr/del.icio.us/YouTube/etc. (see SocialThing, Facebook Import, Grazr, etc.) but not everyone has the community that has been built around the conversations on FriendFeed.

Now Playing: Lloyd Banks - Born Alone, Die Alone


 

Categories: Social Software

After talking about it for months, we finally have an alpha version of the next release of RSS Bandit codenamed Phoenix. There are two key new features in this release. The first is that we've finally finished off the last of the features related to downloading podcasts. If you go to View->Download Manager you can now view and manage pending downloads of podcasts/enclosures as shown in the screen shot below.

The second feature is one I'm sure will be appreciated by people who like reading their feeds from multiple computers but still want a desktop-based feed reader for a variety of reasons (e.g. reading feeds from a corporate intranet while roaming your feeds from the public Web). With this feature you can have multiple feed lists that are synchronized with a Web-based feed reader such as Google Reader or NewsGator Online while still keeping some feeds local. All you need to do is go to File->Synchronize Feeds and follow the steps shown in the screen shots below.

After selecting the option to synchronize feeds, you are taken to a wizard that offers a choice of feed sources to synchronize with and obtains your user credentials if necessary.

Once you have given the wizard your information, your feed list is synchronized and every action you take in RSS Bandit, such as subscribing to new feeds, unsubscribing from existing feeds, renaming feeds or marking items as read, is reflected in your Web-based feed reader of choice. The experience is intended to mirror that of using a desktop mail client in concert with a Web-based email service, and it should work as you'd expect.
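For the curious, you can think of the model roughly along the lines of the following C# sketch. This is not RSS Bandit's actual code, just a hypothetical illustration of the idea that every feed source (local or Web-based) exposes the same operations, so a user action taken in the desktop client can be replayed against the Web-based reader.

using System;
using System.Collections.Generic;

// Hypothetical abstraction of the mail-client-style synchronization model described above.
interface IFeedSource
{
    string Name { get; }
    void MarkItemRead(Uri feedUrl, string itemId);
}

class LocalFeedSource : IFeedSource
{
    public string Name { get { return "Local cache"; } }
    public void MarkItemRead(Uri feedUrl, string itemId)
    {
        Console.WriteLine("{0}: marked {1} as read", Name, itemId);
    }
}

class WebReaderFeedSource : IFeedSource
{
    public string Name { get { return "Web-based reader"; } }
    public void MarkItemRead(Uri feedUrl, string itemId)
    {
        // A real client would call the Web-based reader's synchronization API here.
        Console.WriteLine("{0}: queued 'mark read' for {1}", Name, itemId);
    }
}

class SyncDemo
{
    static void Main()
    {
        var sources = new List<IFeedSource> { new LocalFeedSource(), new WebReaderFeedSource() };
        var feed = new Uri("http://example.com/blog/rss.xml"); // made-up feed URL
        foreach (IFeedSource source in sources)
            source.MarkItemRead(feed, "item-123");
    }
}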

In addition, you can use Google Reader's sharing feature (or NewsGator's clipping feature) directly from RSS Bandit, as shown below.

I'm sure you're wondering where you can download this version of RSS Bandit and try it out for yourself. Get it here. There are two files in the installer package; I suggest running setup.exe because it validates that you have the correct prerequisites to run the application and tells you where to get them if you don't.

Please note that this is alpha-quality software, so although it is intended to be fully functional (except for searching within your subscriptions) you should expect it to be buggy. If you have any problems, feel free to file a bug on SourceForge or ask a question on our forum.

PS: If you are an existing RSS Bandit user I'd suggest backing up your application data folder just in case. On a positive note, we've fixed dozens of bugs from previous versions.

Now Playing: Young Buck - Puff Puff Pass (ft. Ky-Mani Marley)


 

Categories: RSS Bandit

When adding new features that dramatically change how users interact with your site, it is good practice to determine up front whether your service can handle these new kinds of interactions, so you don't end up constantly disabling features due to the high load they place on your site.

A couple of weeks ago, the developers at Facebook posted an entry about the architecture of Facebook Chat (it's written in Erlang, OMFG!!!) and I was interested to see the discussion of how they tested the scalability of the feature to ensure they didn't create negative first impressions due to scale issues when they rolled it out, or impact the availability of their main site. The relevant part of the post is excerpted below:

Ramping up:

The secret for going from zero to seventy million users overnight is to avoid doing it all in one fell swoop. We chose to simulate the impact of many real users hitting many machines by means of a "dark launch" period in which Facebook pages would make connections to the chat servers, query for presence information and simulate message sends without a single UI element drawn on the page. With the "dark launch" bugs fixed, we hope that you enjoy Facebook Chat now that the UI lights have been turned on.

The approach followed by Facebook encapsulates some of the industry's best practices for rolling out potentially expensive new features on your site. The dark launch is a practice we've used when launching features for Windows Live in the past. During a dark launch, the feature is enabled on the site but not actually shown in the user interface. The purpose is to monitor whether the site can handle the load of the feature during day-to-day interactions, without exposing the feature to end users in case it turns out the answer is no.

An example of a feature that could have been rolled out using a dark launch is the Replies tab on Twitter. A simple way to implement the @replies feature is to create a message queue (i.e. an inbox) for each user that contains all the replies they have been sent. To test whether this approach was scalable, the team could have built the feature and had messages going into users' inboxes without showing the Replies tab in the UI. That way they could test the load on their message queue and fix bugs in it based on real user interactions, without their users even knowing that they were load testing a new feature. If it turned out that they couldn't handle the load or needed to beef up their message queuing infrastructure, they could disable the feature, change the implementation and retest quickly without exposing their flaws to users.
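Here is a hedged C# sketch of that idea. It is not Twitter's actual implementation, just an illustration of doing the backend work on every update while a hypothetical feature flag keeps the UI hidden during the dark launch.

using System;
using System.Collections.Generic;

class ReplyInboxSketch
{
    // Per-user reply inboxes (a stand-in for a real message queue).
    static readonly Dictionary<string, List<string>> ReplyInboxes =
        new Dictionary<string, List<string>>();

    // Hypothetical feature flag: false during the dark launch, true at public launch.
    static readonly bool ShowRepliesTab = false;

    static void RecordUpdate(string author, string text)
    {
        // Always do the backend work so its load can be measured in production.
        foreach (string word in text.Split(' '))
        {
            if (word.StartsWith("@") && word.Length > 1)
            {
                string recipient = word.Substring(1);
                if (!ReplyInboxes.ContainsKey(recipient))
                    ReplyInboxes[recipient] = new List<string>();
                ReplyInboxes[recipient].Add(author + ": " + text);
            }
        }
    }

    static void RenderProfilePage(string user)
    {
        // Only the UI is gated; flipping the flag "turns the lights on".
        if (ShowRepliesTab)
            Console.WriteLine("{0} has {1} replies", user,
                ReplyInboxes.ContainsKey(user) ? ReplyInboxes[user].Count : 0);
    }

    static void Main()
    {
        RecordUpdate("alice", "@bob nice post");
        RenderProfilePage("bob"); // prints nothing while dark-launched
    }
}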

The main problem with a dark launch is that users often use social features quite differently than site developers expect. In the specific case of the Replies tab in Twitter, it is quite likely that usage of "@username" replies would increase significantly once the tab was introduced, since the feature increases the chance that the recipient will see the response compared to a regular tweet. So the load observed during the dark launch would understate the actual load once the feature was enabled. The next step, then, is to introduce the feature to a subset of your users using a gradual ramp-up approach.

During a gradual ramp-up, you release the feature to small, preferably self-contained groups of users so you can see how they actually interact with it without bringing down your entire site if their usage patterns greatly exceed your expectations. This is one of the reasons Facebook Chat was gradually exposed to users from specific networks before being enabled across the entire site, even though the feature had already been dark launched.
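A gradual ramp-up often boils down to a check like the following C# sketch. This is not Facebook's actual rollout code; the pilot network names and rollout percentage are made up, and a real system would make them configurable at runtime.

using System;

class RampUpSketch
{
    static readonly string[] PilotNetworks = { "Pilot University A", "Pilot University B" }; // hypothetical pilot group
    static int rolloutPercentage = 5; // raised over time as the backend proves it can cope

    static bool IsFeatureEnabled(string userId, string network)
    {
        // Self-contained pilot groups see the feature first.
        if (Array.IndexOf(PilotNetworks, network) >= 0)
            return true;

        // Hashing the user id gives a stable pseudo-random bucket between 0 and 99.
        int bucket = Math.Abs(userId.GetHashCode()) % 100;
        return bucket < rolloutPercentage;
    }

    static void Main()
    {
        Console.WriteLine(IsFeatureEnabled("user42", "Pilot University A")); // True: pilot network
        Console.WriteLine(IsFeatureEnabled("user42", "Elsewhere"));          // depends on the user's bucket
    }
}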

Another common practice for preventing certain features from impacting your core user experience is to isolate your features from each other as much as possible. Although degrading a feature should be a last resort, it is better for one or two features of your site to be broken than for the entire site to be down. The Twitter folks found this out the hard way when traffic from their instant messaging interface was bringing down the site, and they finally resorted to disabling the ability to update Twitter via IM until further notice. Ideally your site's APIs and Web services should be isolated from your core features, so that even if the APIs go over capacity your entire site doesn't go down; at worst, access to data via your site's APIs becomes unavailable. The same applies to ancillary features like the Replies tab in Twitter or Facebook Chat; if such a service is overloaded, it shouldn't mean the entire site becomes unavailable. This is one area where following the principles of service-oriented architecture can save your Web site a lot of pain.
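In code, the isolation can be as simple as refusing to let an ancillary feature's failure escape into the core page. The following C# sketch is a generic illustration of that idea, not any particular site's implementation.

using System;

class FeatureIsolationSketch
{
    static string LoadChatBuddyList()
    {
        // Stand-in for a call to a separate, possibly overloaded, chat or API backend.
        throw new TimeoutException("chat backend over capacity");
    }

    static void RenderPage()
    {
        Console.WriteLine("core profile content");

        try
        {
            Console.WriteLine(LoadChatBuddyList());
        }
        catch (Exception)
        {
            // Swallow the failure and degrade gracefully: the rest of the site stays up.
            Console.WriteLine("Chat is temporarily unavailable.");
        }
    }

    static void Main()
    {
        RenderPage();
    }
}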

Now Playing: Earth, Wind & Fire - Let's Groove


 

Categories: Web Development

In the past year I've spent a lot of time thinking about hiring, due to a recent surge in the number of interviews I've participated in as well as a surge in the number of folks I know who've decided to "try new things". One thing I've noticed is that software companies, and teams within large software companies like Microsoft, tend to fall into two broad camps when it comes to hiring. There are the teams/companies that seem to attract tons of smart, superstar programmers the way a refrigerator door attracts magnets, and then there are those that use the beachcomber technique of sifting through tons of poorly written resumes hoping to find someone valuable but often ending up with people who seem valuable but actually aren't (i.e. good at interviewing, lousy at actually getting work done).

Steve Yegge talks about this problem in his post Done, and Gets Things Smart, which is excerpted below:

The "extended interview" (in any form) is the only solution I've ever seen to the horrible dilemma, How do you hire someone smarter than you? Or even the simpler problem, How do you identify someone who's actually Smart, and Gets Things Done? Interviews alone just don't cut it.
Let me say it more directly, for those of you who haven't taken this personally yet: you can't do what Joel is asking you to do. You're not qualified. The Smart and Gets Things Done approach to interviewing will only get you copies of yourself, and the work of Dunning and Kruger implies that if you hire someone better than you are, then it's entirely accidental.
...
So let's assume you're looking at the vast ocean of programmers, all of whom are self-professed superstars who've gotten lots of "stuff" done, and you want to identify not the superstars, but the super-heroes. How do you do it? Well, Brian Dougherty of Geoworks did it somehow. Jeff Bezos did it somehow. Larry and Sergey did it somehow. I'm willing to bet good money that every successful tech company out there had some freakishly good seed engineers.
...
You can only find Done, and Gets Things Smart people in two ways, and one of them I still don't understand very well. The first way is to get real lucky and have one as a coworker or classmate. You work with them for a few years and come to realize they're just cut from a finer cloth than you and your other unwashed cohorts. You may be friends with some of them, which helps with the recruiting a little, but not necessarily. The important thing is that you recognize them, even if you don't know what makes them tick.
...
I think Identification Approach #2, and this is the one I don't understand very well, is that you "ask around". You know. You manually perform the graph build-and-traversal done by the Facebook "Smartest Friend" plug-in, where you ask everyone to name the best engineer they know, and continue doing that until it converges.

This jibes with my experience watching various software startups and knowing the history of various teams at Microsoft over the past few years. The products that seem to have hired the most phenomenal programmers and have achieved great things often start off with some person trying to hire the smartest person they know or knew from past jobs (Approach #1). Those people in turn try to attract the smartest people they've known and that happens recursively (Approach #2).
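For fun, Yegge's "ask around until it converges" approach is essentially a graph traversal, something like the toy C# sketch below. The names and recommendations are made up; the point is simply that following each person's "best engineer I know" pointer eventually stops changing.

using System;
using System.Collections.Generic;

class AskAroundSketch
{
    // Each person names the best engineer they know (all names invented for the example).
    static readonly Dictionary<string, string> BestEngineerTheyKnow = new Dictionary<string, string>
    {
        { "you", "alice" },
        { "alice", "bob" },
        { "bob", "carol" },
        { "carol", "carol" } // carol names herself: the recommendations have converged
    };

    static string AskAround(string start)
    {
        var visited = new HashSet<string>();
        string current = start;
        while (visited.Add(current) && BestEngineerTheyKnow.ContainsKey(current))
        {
            string next = BestEngineerTheyKnow[current];
            if (next == current) break; // converged
            current = next;
        }
        return current;
    }

    static void Main()
    {
        Console.WriteLine(AskAround("you")); // carol
    }
}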

I remember a few years ago chatting with a coworker who mentioned that some Harvard-based startup was hiring super smart, young Harvard alumni from Microsoft and a couple of other technology companies at a rapid clip. It seems people were recommended by their friends at the startup and those folks would in turn come back to Microsoft/Google/etc to convince their ex-Harvard chums to come join in the fun. It turns out that startup was Facebook and since then the company has impressed the world with its output. Google used to have a similar approach to hiring until the company grew too big and had to start utilizing the beachcomber technique as well. I've also seen this technique work successfully for a number of teams at Microsoft.

Although this technique sounds unrealistic, it actually isn't as difficult as it once was, thanks to the Web and social networking sites. It is now quite easy for people to stay in touch with or reconnect with people they knew from previous jobs or back in their school days. Thus the big barrier to adopting this approach to hiring isn't that employees won't have any recommendations for super-smart people they'd love to work with if given the chance. The real barrier is that most employers don't know how to court potential employees or, even worse, don't believe that they have to. Instead they expect people to want to work for them, which means they get a flood of awful resumes, put a bunch of candidates through a flawed interview process, eventually get tired of the entire charade and finally hire the first warm body to show up after they reach their breaking point. All of this could be avoided if they simply leveraged the social networks of their best employees. Unfortunately, common sense is never as common as you expect it to be.

Now Playing: Soundgarden - Jesus Christ Pose


 

Categories: Life in the B0rg Cube