Harish Mallipeddi has a blog post entitled Measuring efficiency of tagging with Entropy which links to the paper Understanding Navigability of Social Tagging Systems by Ed Chi and Todd Mytkowicz of Xerox PARC and excerpts its key findings. One result of their research, which seems obvious in hindsight and shows one of the issues that social software has to deal with as its community of users grows, was
The way he does that is to measure entropy (yup that same old same
old Claude Shannon’s information theory which you learned in one of the
CS courses) of entities like documents (D), users (U) and tags (T). His
research group crawled the entire del.icio.us archive and then
calculated the entropies. Here’s what they found:
• H(D|T) specifies the social navigation efficiency. How
efficient is it for us to specify a set of tags to find a set of
specific documents? We found that in del.icio.us that it is getting less and less efficient.
This makes sense when you think about it. Let's say the first set of users of del.icio.us came from a homogenous software development background and started applying the tag "xml" to mean items about the eXtensible Markup Language. Later on, as the community grew, a number of gamers joined the site and they now use the tag "xml" to refer to items about the game X-Men Legends. Now if you are one of the original geek users of the site, the URL http://del.icio.us/tag/xml is no longer just about markup languages but also about video games. To actually find items strictly about the eXtensible Markup Language, you may have to add other tags as refinements, such as http://del.icio.us/tag/xml+programming.
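To make the intuition concrete, here is a minimal Python sketch of how H(D|T) could be computed from a list of (document, tag) bookmarks. This is only an illustration under my own assumptions, not the method used in the paper: the function name, the document names and the bookmark counts are all made up, and every bookmark is treated as equally likely.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Return H(D|T) in bits for a list of (document, tag) bookmark pairs."""
    total = len(pairs)
    tag_counts = Counter(tag for _, tag in pairs)  # how often each tag is used
    joint_counts = Counter(pairs)                  # how often each (doc, tag) occurs
    h = 0.0
    for (doc, tag), n in joint_counts.items():
        p_joint = n / total                  # p(d, t)
        p_doc_given_tag = n / tag_counts[tag]  # p(d | t)
        h -= p_joint * log2(p_doc_given_tag)
    return h


# Hypothetical early community: "xml" only ever means the markup language.
early_pairs = [
    ("xml-spec", "xml"), ("xml-spec", "xml"),
    ("xml-tutorial", "xml"), ("xml-tutorial", "xml"),
]

# Hypothetical later community: gamers also tag X-Men Legends items as "xml".
later_pairs = early_pairs + [
    ("xmen-review", "xml"), ("xmen-review", "xml"),
    ("xmen-cheats", "xml"), ("xmen-cheats", "xml"),
]

early_h = conditional_entropy(early_pairs)
later_h = conditional_entropy(later_pairs)
print(early_h, later_h)
```

In this toy data the tag "xml" narrows things down to two equally likely documents at first and to four later on, so the uncertainty left after specifying the tag goes up. That is the "less and less efficient" effect in miniature.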
What this means is that to the oldest users of the site, the quality of the tagging system will seem to degrade over time even though this is a natural consequence of the site growing and diversifying its user base. Of course, this is only a problem if a lot of people use del.icio.us to find all items about a topic (i.e. browsing by tags) as opposed to just storing their individual bookmarks or subscribing to the bookmarks of people they know and trust.
The recurring "diversity in conferences" debate was kicked off again by a blog post by Jason Kottke entitled Gender Diversity at Web Conferences which prompted interesting responses from folks like Eric Meyer, Anil Dash and Shelley Powers. They are all good posts with stuff I agree and disagree with in them, but I wasn't moved to write until I read the post Why are smart people still stuck on gender and skin-color blinders? by Tantek Çelik where he wrote
Why is it that gender (and less often race, nay, skin-color, see below)
are the only physical characteristics that lots of otherwise smart
people appear to chime in support for diversity of?
E.g. as long as we are trying for greater diversity in superficial
physical characteristics (superficial because what do such
characteristics have to do with the stated directly relevant criteria
of "technical expertise, speaking skills, professional stature, brand
appropriateness, and marketability" - though perhaps I can see a
tenuous link with "rainbow" marketing), why not ask about other such
characteristics?
Where are all the green-eyed folks?
Where are all the folks with facial tattoos?
Where are all the redheads?
Where are the speakers with non-ear facial piercings?
Surely such speakers would help with "hipness" marketing.
I found this post to be disingenuous and wondered how anybody could downplay the gender and racial bias in the "Web 2.0" technology conference scene by equating it to a preference for green-eyed speakers. So I decided to throw in my $0.02 on this topic...again.
After the last ETech, I realized I was seeing the same faces and
hearing the same things over and over again. More importantly, I
noticed that the demographics of the speaker lists for these
conferences don't match the software industry as a whole let alone the
users who we are supposed to be building the software for.
There were lots of little bits of ignorance by the speakers and
audience which added up in a way that rubbed me wrong. For example, at
the 2005 Web 2.0 conference
a lot of people were ignorant of Skype except as 'that startup that got
a bunch of money from eBay'. Given that there are a significant amount
of foreigners in the U.S. software industry who use Skype to keep in
touch with folks back home, it was surprising to see so much ignorance
about it at a supposedly leading edge technology conference. The same
thing goes for how surprised people were by how teenagers used the Web and computers. Additionally, there are just as many women using social software such
as photo sharing, instant messaging, social networking, etc as men yet
you rarely see their perspectives presented at any of these
conferences.
When I think of diversity, I expect diversity of perspectives. People's
perspectives are often shaped by their background and experiences. When
you have a conference about an industry which is filled with people of
diverse backgrounds building software for people of diverse
backgrounds, it is a disservice to have the conversation and
perspectives be homogenous. The software industry isn't just young
white males in their mid-20s to mid-30s nor is that the primary
demographic of Web users.
Personally, I've gotten tired of attending conferences where we hear more about technologies and sites that the homogenous demographic of young to middle-aged, white, male computer geeks find interesting (e.g. del.icio.us and tagging) and less about what Web users actually use regularly or find interesting (hint: it isn't del.icio.us and it sure as fuck isn't tagging).