Dare Obasanjo's weblog

March 31, 2009

@ 01:19 PM

Seeking the perfect website: Design vs. Experimentation

Last week, Douglas Bowman posted a screed against basing web design decisions strictly on usage data. In a post entitled Goodbye Google, he wrote:

With every new design decision, critics cry foul. Without conviction, doubt creeps in. Instincts fail. “Is this the right move?” When a company is filled with engineers, it turns to engineering to solve problems. Reduce each decision to a simple logic problem. Remove all subjectivity and just look at the data. Data in your favor? Ok, launch it. Data shows negative effects? Back to the drawing board. And that data eventually becomes a crutch for every decision, paralyzing the company and preventing it from making any daring design decisions.

Yes, it’s true that a team at Google couldn’t decide between two blues, so they’re testing 41 shades between each blue to see which one performs better. I had a recent debate over whether a border should be 3, 4 or 5 pixels wide, and was asked to prove my case. I can’t operate in an environment like that. I’ve grown tired of debating such minuscule design decisions. There are more exciting design problems in this world to tackle.

I can’t fault Google for this reliance on data. And I can’t exactly point to financial failure or a shrinking number of users to prove it has done anything wrong. Billions of shareholder dollars are at stake. The company has millions of users around the world to please. That’s no easy task. Google has momentum, and its leadership found a path that works very well.

One thing I love about building web-based software is that there is the unique ability to try out different designs and test them in front of thousands to millions of users without incurring a massive cost. Experimentation practices such as A/B testing and Multivariate testing enable web designers to measure the impact of their designs on the usability of a site on actual users instead of having to resort to theoretical arguments about the quality of the design or waiting until after they've shipped to find out the new design is a mistake.
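One practical detail behind any A/B test is deciding which users see which design. Below is a minimal sketch in Python, assuming a hash-based split; the function name, the experiment label and the variant names are all made up for illustration and are not taken from any particular site's experimentation system.

```python
import hashlib


def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")) -> str:
    """Deterministically map a user to one variant of an experiment.

    Hashing the experiment name together with the user id means the same
    user always sees the same design, and different experiments split
    users independently of one another. All names here are hypothetical.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


if __name__ == "__main__":
    # Hypothetical user ids and experiment label.
    for uid in ("alice", "bob", "carol"):
        print(uid, assign_variant(uid, "cart-recommendations"))
```

The split needs no stored state: the assignment can be recomputed on every page view, which is one reason this style of bucketing is popular for web experiments.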

Experimentation is most useful when you have a clear goal or workflow the design is trying to achieve and you are worried that a design change may impact that goal. A great example of this is how shopping cart recommendations were shipped at Amazon, a story told by Greg Linden in his post Early Amazon: Shopping cart recommendations, excerpted below:

The idea of recommending items at checkout is nothing new. Grocery stores put candy and other impulse buys in the checkout lanes. Hardware stores put small tools and gadgets near the register. But here we had an opportunity to personalize impulse buys. It is as if the rack near the checkout lane peered into your grocery cart and magically rearranged the candy based on what you are buying. Health food in your cart? Let's bubble that organic dark chocolate bar to the top of the impulse buys. Steaks and soda? Get those snack-sized potato chip bags up there right away.

I hacked up a prototype. On a test site, I modified the Amazon.com shopping cart page to recommend other items you might enjoy adding to your cart. Looked pretty good to me. I started showing it around. While the reaction was positive, there was some concern. In particular, a marketing senior vice-president was dead set against it. His main objection was that it might distract people away from checking out -- it is true that it is much easier and more common to see customers abandon their cart at the register in online retail -- and he rallied others to his cause.

At this point, I was told I was forbidden to work on this any further. I was told Amazon was not ready to launch this feature. It should have stopped there. Instead, I prepared the feature for an online test. I believed in shopping cart recommendations. I wanted to measure the sales impact. I heard the SVP was angry when he discovered I was pushing out a test. But, even for top executives, it was hard to block a test. Measurement is good. The only good argument against testing would be that the negative impact might be so severe that Amazon couldn't afford it, a difficult claim to make. The test rolled out.

The results were clear. Not only did it win, but the feature won by such a wide margin that not having it live was costing Amazon a noticeable chunk of change. With new urgency, shopping cart recommendations launched.

This is a great example of using data to validate a design change instead of relying on gut feel. However, one thing that is often overlooked is that the changes still have to be well-designed. The shopping cart recommendations feature on Amazon is designed in such a way that it doesn't break you out of the checkout flow. See below for a screenshot of the current shopping cart recommendation flow on Amazon.

On the above page, it is always very clear how to complete the checkout AND the process of adding an item to the cart is a one click process that keeps you on the same page. Sadly, a lot of sites have tried to implement similar features but often end up causing users to abandon shopping carts because the design encourages users to break their flow as part of the checkout process.

One of the places experimentation falls down is when it is used to measure the impact of aesthetic changes to the site when these changes aren't part of a particular workflow (e.g. changing the company logo). Another problem with experimentation is that it may encourage focusing on metrics that are easily measurable to the detriment of other aspects of the user experience. For example, Google's famous holiday logos were a way to show off the fun, light-hearted aspect of their brand. Doing A/B testing on whether people do more searches with or without the holiday logos on the page would miss the point. Similarly, even when A/B testing shows that a design hurts a particular workflow, the design is sometimes worth keeping if the message behind it benefits the brand. For example, take this story from Valleywag, "I'm feeling lucky" button costs Google $110 million per year:

Google cofounder Sergey Brin told public radio's Marketplace that around one percent of all Google searches go through the "I'm Feeling Lucky" button. Because the button takes users directly to the top search result, Google doesn't get to show search ads on one percent of all its searches. That costs the company around $110 million in annual revenue, according to Rapt's Tom Chavez. So why does Google keep such a costly button around?

"It's possible to become too dry, too corporate, too much about making money. I think what's delightful about 'I'm Feeling Lucky' is that it reminds you there are real people here," Google exec Marissa Mayer explained

~~~

Last night, I stumbled on a design change in Twitter that I suspect wouldn't have been rolled out if it had gone through experimentation first. On the Twitter blog, Biz Stone writes in Replies Are Now Mentions:

We're updating the Replies feature and referring to it instead as Mentions. In your Twitter sidebar you'll now see your own @username tab. When you click that tab, you'll see a list of all tweets referencing your account with the @username convention anywhere in the tweet—instead of only at the beginning which is how it used to work. So for me it would be all mentions of @biz. For developers, this update will also be included in the APIs.

Compare the above sidebar with the previous one below. Which do you think will be more intuitive for new users to understand?

This would be a great candidate to test because the metric is straightforward; compare clicks on the replies tab by new users using the old version as the control and the new version as the test. Then again, maybe they did A/B test it which is why the "@username" text is used instead of "Mentions" which is even more unintuitive. :)
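To make the comparison concrete, here is a rough sketch of how the click-through rates of the control and test groups might be compared with a two-proportion z-test. This is one common way to analyze such an experiment, not a description of what Twitter does, and the counts in the example are invented purely for illustration.

```python
import math
from typing import Tuple


def two_proportion_z_test(clicks_control: int, users_control: int,
                          clicks_test: int, users_test: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click rates.

    Uses the pooled-proportion normal approximation, which is reasonable
    when both groups are large.
    """
    rate_control = clicks_control / users_control
    rate_test = clicks_test / users_test
    pooled = (clicks_control + clicks_test) / (users_control + users_test)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_control + 1 / users_test))
    if std_err == 0:
        raise ValueError("click rate is 0% or 100% in both groups; test is undefined")
    z = (rate_test - rate_control) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers: new users who clicked the tab in each version.
    z, p = two_proportion_z_test(clicks_control=150, users_control=1000,
                                 clicks_test=100, users_test=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A negative z here would mean new users clicked the new tab less often than the old one; a small p-value says the gap is unlikely to be noise.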

Now Playing: Jamie Foxx - Blame It (remix) (feat. T-Pain & Rosco)

Categories: Web Development

March 24, 2009

@ 01:07 PM

Comments [10]

Sharing social activity streams across the Web: How Gnip fits in

Over the past two weeks I participated in panels at both SXSW and MIX 09 on the growing trend of providing streams of user activities on social sites and aggregating these activities from multiple services into a single experience. Aggregating activities from multiple sites into a single service for the purpose of creating an activity stream is fairly commonplace today and was popularized by FriendFeed. This functionality now exists on many social networking sites and related services including Facebook, Yahoo! Profile and the Windows Live Profile.

In general, the model is to receive or retrieve user updates from a social media site like Flickr and make these updates available on the user's profile on the target social network and share it with the user's friends via an activity stream (or news feed) on the site. The diagram below attempts to capture this many-to-many relationship as it occurs today using some well known services as examples.

The bidirectional arrows are meant to indicate that the relationship can be push-based where the content-based social media site notifies the target social network of new updates from the user or pull-based where the social network polls the site on a regular basis seeking new updates from the target user.

There are two problems that sites have to deal with in this model:

Content sites like Flickr have to deal with being polled unnecessarily millions of times a day by social networks seeking photo updates from their users. The money quote from last year is that FriendFeed polled Flickr 2.7 million times a day to retrieve a total of less than 7,000 updates. Even if they move to a publish-subscribe model, they would not only have to track which users are of interest to which social network but also target APIs on different social networks that are radically different (aka the beautiful f-ing snowflake API problem).
Social aggregation services like FriendFeed and Windows Live have to target dozens of sites, each with different APIs or schemas. Even in the case where the content sites support RSS or Atom, they often use radically different schemas for representing the same data.
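To put the FriendFeed and Flickr numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the two figures quoted above and treats 7,000 as an upper bound on the updates, so the real hit rate was even lower than what it computes; the function name is just for illustration.

```python
from typing import Tuple


def polling_efficiency(polls: int, updates: int) -> Tuple[float, float]:
    """Return (fraction of polls that found an update, polls per update)."""
    return updates / polls, polls / updates


# Figures quoted in the post; 7,000 is an upper bound ("less than 7,000").
hit_rate, polls_per_update = polling_efficiency(polls=2_700_000, updates=7_000)
print(f"At most {hit_rate:.2%} of polls returned something new")
print(f"At least {polls_per_update:.0f} polls were made per update retrieved")
```

In other words, well over 99% of those requests did no useful work for either side.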

The approach I've been advocating along with others in the industry is that we need to adopt standards for activity streams in a way that reduces the complexity of this many-to-many conversation that is currently going on between social sites.

While I was at SXSW, I met one of the folks from Gnip who is advocating an alternate approach. He argued that even with activity stream standards we've only addressed part of the problem. Such standards may mean that FriendFeed gets to reuse their Flickr code to poll Smugmug with little to no changes but it doesn't change the fact that they poll these sites millions of times a day to get a few thousand updates.

Gnip has built a model where content sites publish updates to Gnip, and social networking sites can then choose to either poll Gnip or receive updates from Gnip when an update matches one of the rules they have created (e.g. notify us if you get a digg vote from Carnage4Life). The following diagram captures how Gnip works.

The benefit of this model to content sites like Flickr is that they no longer have to worry about being polled millions of times a day by social aggregation services. The benefit to social networking sites is that they now get a consistent format for data from the social media sites they care about and can choose to either pull the data or have it pushed to them.
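The shape of this model can be sketched as a toy in-memory hub in Python. To be clear, this is not Gnip's actual API: the class names, fields and methods below are all hypothetical, and a real service would add persistence, authentication and a normalized wire format. It only illustrates the idea of publishers pushing once while consumers register rules and pick push or pull delivery.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional


@dataclass(frozen=True)
class Activity:
    source: str  # content site, e.g. "digg"
    actor: str   # user who did something, e.g. "Carnage4Life"
    verb: str    # what they did, e.g. "vote"


@dataclass
class Subscription:
    rule: Callable[[Activity], bool]                  # which activities to deliver
    push: Optional[Callable[[Activity], None]] = None # callback for push delivery
    inbox: Deque[Activity] = field(default_factory=deque)  # buffer for pull delivery


class Hub:
    """Hypothetical middleman: content sites publish, aggregators subscribe."""

    def __init__(self) -> None:
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, name: str, rule: Callable[[Activity], bool],
                  push: Optional[Callable[[Activity], None]] = None) -> None:
        self._subs[name] = Subscription(rule=rule, push=push)

    def publish(self, activity: Activity) -> None:
        # The content site calls this once per update instead of being polled.
        for sub in self._subs.values():
            if not sub.rule(activity):
                continue
            if sub.push is not None:
                sub.push(activity)          # push-based consumer
            else:
                sub.inbox.append(activity)  # held until the consumer polls

    def poll(self, name: str) -> List[Activity]:
        # Pull-based consumers drain whatever matched since their last poll.
        inbox = self._subs[name].inbox
        items = list(inbox)
        inbox.clear()
        return items
```

A consumer that wants the example rule from above would call something like `hub.subscribe("my-site", lambda a: a.source == "digg" and a.actor == "Carnage4Life", push=handler)`, where `handler` is its own callback.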

The main problem I see with this model is that it sets Gnip up to be a central point of failure, and I'd personally rather interact directly with the content services instead of injecting a middle man into the process. However, I can see how their approach would be attractive to many sites who might be buckling under the load of being constantly polled and to social aggregation sites that are tired of hand coding adapters for each new social media site they want to integrate with.

What do you think of Gnip's service and the problem space in general?

Now Playing: Eamon - F**k It (I Don't Want You Back)

Categories: Startup Shoutout | Syndication Technology

March 22, 2009

@ 11:49 PM

Comments [1]

Video: Standards for Aggregating Activity Feeds and Social Aggregation Services at MIX '09 Conference

On Thursday the 19th of March there was a panel on activity feeds like you find on Twitter & Facebook and social aggregation services like FriendFeed. The panelists were Kevin Marks (@kevinmarks) of Google, John McCrea (@johnmccrea) of Plaxo, Monica Keller (@ciberch) of MySpace, Luke Shepard (@lukeshepard) of Facebook and Marc Canter (@marccanter4real) of Broadband Mechanics. Yours truly was the moderator.

Although the turnout wasn't massive given it wasn't the run of the mill content for MIX 09, the audience was very engaged and we had almost 45 minutes of Q&A until we ran out of time. You can find the video here and also embedded below if you have Silverlight installed.

Now Playing: Jodeci - My Heart Belongs to You

Categories: Social Software | Trip Report

March 21, 2009

@ 04:32 PM

Comments [17]

Facebook "stream" redesign: Disruptive companies don't listen to their customers - Mark Zuckerberg

Facebook's latest redesign, which has clearly been inspired by Twitter's real-time stream of status updates, has had a ton of detractors from all corners. One of the biggest places where the outcry has centralized is the Facebook Layout vote application, which has had over a million votes from Facebook users, with over 94% against the new changes, and almost 600,000 comments, most of which seem to be negative if the hundred or so I read were a representative sample.

One thing I've wondered is how the folks at Facebook are taking this feedback. On the one hand, people don't like changes, and the more disruptive the change the more they fight it. It's almost comical to go back and read the Time magazine article about the backlash against the news feed from back in 2006 given how much the feature has not only ended up defining Facebook but how significantly it has impacted the social software landscape at large. On the other hand, sometimes people have a good reason to protest, such as the outcry against the privacy-destroying Facebook Beacon, which eventually inspired a mea culpa from Zuckerberg himself.

Owen Thomas from Valleywag has an article entitled Even Facebook Employees Hate the Redesign which contains the following excerpt:

The feedback on Facebook's new look, which emphasizes a stream of Twitter-like status updates, is almost universally, howlingly negative. Why isn't CEO Mark Zuckerberg listening to users? Because he doesn't have to, he's told employees.

A tipster tells us that Zuckerberg sent an email to Facebook staff reacting to criticism of the changes: "He said something like 'the most disruptive companies don't listen to their customers.'" Another tipster who has seen the email says Zuckerberg implied that companies were "stupid" for "listening to their customers." The anti-customer diktat has many Facebook employees up in arms, we hear.

When your application becomes an integral part of your customers' lives and identities, it is almost expected that they protest any major changes to the user experience. The problem is that you may eventually become jaded about negative feedback because you assume that for the most part the protests are simply people's natural resistance to change.

I tend to agree that disruptive companies don't listen to their customers in the narrow sense of using them as a barometer to decide what products or features to build. Customer feature requests aren't the source of input that would spawn a Netflix in a world that had Blockbuster & Hollywood Video. Such disruptive products are spawned from understanding the customers better than they understand themselves. If you had simply "listened" to Blockbuster's customers you'd think the best way to compete with them would be to have cheaper late fees or a bigger selection in your store. Netflix actually went a step further and understood the underlying customer problems (e.g. even going to a video store is a hassle, which is why you end up with late fees in the first place) and created a product that was truly disruptive.

Using that model of "disruptive companies", the question then is whether the new Facebook is an example of understanding your customers better than they understand themselves or is truly a mistake. For my take on the answer, I'll first point out a comment on Valleywag on the redesign:

Here's the problem with the redesign. Twitter is a micro-blog. The 140 character Livejournal.

Facebook is not a blog. In its old form it was a really great PHONEBOOK. A phonebook that not only updated your acquaintance's (most FB friends are not friends) contact info, but also gave you a general summary of their life. It was a big picture kind of thing: Where they are, who they're dating, what school or job they have, and how to contact them. It was never about "sharing" your daily thoughts on how great your panini was or omg gossip girl is back! The livejournal twit-blog crap is messing up the phonebook interface.

This is the crux of the problem with the Facebook redesign. The expectations around how user relationships were created on Twitter are totally different from how they were on Facebook. On Twitter, users explicitly decide as part of following someone that they want all of the person's tweets in their stream. In fact, this is the only feature of the relationship on Twitter. On Facebook, you have relationships with people that attempt to mirror your real life so you have your boss, coworkers, school friends and acquaintances all trying to be part of your social graph because FB is really a kind of "rolodex" in the sky.

The fact that you got a news feed was kind of a side effect of filling out your virtual rolodex, but it was cool because you got the highlights of what was going on in the lives of your friends and family. There is a legitimate problem that you weren't getting the full gist of everything your 120 contacts (average number of Facebook friends) were doing online, but it would clearly lead to information overload to get up to the minute updates about the breakfast habits of some guy who sat next to you in middle school.

Somewhere along the line, it seems the folks at Facebook didn't internalize this fundamental difference in the social context that differentiates user to user relationships on Twitter versus Facebook. This to me is a big mistake.

Now Playing: Goodie Mob - They Don't Dance No Mo'