12 Nov 2011

How Hashtagging the Web Could Improve Our Collective Intelligence




Laura Larsell is the information ontologist at Trapit, a content discovery, personalization and curation platform currently in beta. Laura holds an M.A. in library sciences from the University of Texas at Austin.
I rolled my eyes when the Library of Congress announced in the spring of 2010 that it would be archiving Twitter. Great, I thought: drunken tweets about burritos preserved for posterity.
But the Library of Congress, as it turns out, was more forward-thinking than I could have imagined. Twitter data, presumed to represent the pulse (and sometimes the future) of popular consciousness, now commands big bucks from hedge funds that in turn use it to make investment predictions. Even scientists are tapping into Twitter data for research purposes.
Why all the fuss over tweets? Twitter hosts valuable, communal conversation in real time, and its trends become more powerful the more users contribute to the dialogue. Twitter also allows the chatter of millions to be parsed into channels (hashtags) of real-time conversation that covers widely varying topics. Jokes, rumors, political movements, pop culture fanaticisms, the collective screaming of teenagers: they all bubble to the surface and shift and change like an oil slick, much like a collective human consciousness.
Twitter generates mass interest and curates collective thought, but until its usage rises significantly, its trends cannot represent the true pulse of world conversation. That needs to change.

Applying Twitter Logic to the Web


Twitter captures an admittedly small slice of the collective world consciousness — in the U.S. only 78.2% of households have access to the Internet, and only 13% of online Americans actively use Twitter. It is also a platform as much about stats and bot spammers as it is about honest conversation.
Despite the clutter, Twitter continues to generate an abundance of sociologically interesting data every day. Researchers from Cornell University recently used Twitter data to look for and examine trends in mood over time. They determined that collective mood patterns fluctuate in predictable ways over the course of the day and year. While this conclusion may seem obvious, before Twitter came along, documenting this type of pattern would have required a massive survey and multiple studies.
One thing that makes Twitter so powerful is its use of a standard language: hashtags. Any hashtagged tweet is automatically linked to every other tweet that shares the same tag. This allows for consistent dialogue and measurement.
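To make the linking idea concrete, here is a minimal Python sketch of how shared tags can group otherwise unrelated messages. It is an illustration only, not Twitter's actual implementation: the function name `index_by_hashtag`, the simplified hashtag pattern, and the sample tweets are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed, simplified pattern: '#' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")


def index_by_hashtag(tweets):
    """Map each lowercased hashtag to the list of tweets that contain it."""
    index = defaultdict(list)
    for tweet in tweets:
        # A set keeps a tweet from being listed twice if it repeats a tag.
        for tag in {t.lower() for t in HASHTAG.findall(tweet)}:
            index[tag].append(tweet)
    return dict(index)


# Invented sample data, for illustration only.
sample_tweets = [
    "Best #watermelon popsicles ever",
    "How to pick a ripe #Watermelon at the market",
    "Burritos for dinner #food",
]
hashtag_index = index_by_hashtag(sample_tweets)
```

Looking up `hashtag_index["watermelon"]` returns both watermelon tweets, even though their authors never referred to each other. The shared tag is the only thing connecting them.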
However, the Internet as a whole is not a very consistent medium. Patterns emerge in specific areas of the web, but no uniform underlying structure exists to merge these patterns. Content may go viral or score a high page rank, but it doesn’t easily connect to related topics or encourage a larger conversation. It is a frustrating vestige of print culture that my web curation should be limited by my search ability.
Furthermore, what happens to long-form digital conversation in the era of Twitter? Consider especially that long-form conversations include more invested and potentially expert perspectives. These perspectives differ from the collective consciousness, and yet they are not easily parsed into mainstream channels.

The Watermelon Story





A big part of what I do every day is train an algorithm to tag documents in the manner of my choosing. The software is in beta, and is presently only culling from a selection of web content, but it does pretty well with simple concepts.
Early on, I set the machine to find content relevant to the subject/tag “watermelon.” It’s a limited data set, but this is what I’ve found so far: People write about watermelons consistently throughout the summer, most frequently in mid-summer. Again, this may seem an obvious conclusion, but proving it would have taken an incredible amount of time and effort on my part.
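Once documents carry a tag, a seasonal pattern like this one comes down to simple counting. The Python sketch below is hypothetical and is not the tagging software described here: the function `posts_per_month`, the `(date, tags)` record shape, and the sample posts are invented purely to show the idea.

```python
from collections import Counter
from datetime import date


def posts_per_month(tagged_posts, tag):
    """Count posts carrying `tag`, grouped by calendar month (1-12)."""
    return Counter(
        published.month
        for published, tags in tagged_posts
        if tag in tags
    )


# Invented sample records: (publication date, set of assigned tags).
sample_posts = [
    (date(2011, 6, 20), {"watermelon", "recipes"}),
    (date(2011, 7, 4), {"watermelon"}),
    (date(2011, 7, 18), {"watermelon", "cocktails"}),
    (date(2011, 8, 2), {"watermelon"}),
    (date(2011, 12, 1), {"cranberry"}),
]
monthly_counts = posts_per_month(sample_posts, "watermelon")

# The month with the most tagged posts, and how many it had.
peak_month, peak_count = monthly_counts.most_common(1)[0]
```

With consistent tags, the counting is trivial. The hard part is assigning the tags reliably in the first place.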
So what do people write about watermelons online? Recipes involving or featuring watermelons are by far the most popular watermelon content, and the most popular serving suggestions feature various kinds of boozy drinks and popsicles. The second most popular posts are how-tos that guide readers through the watermelon selection process (knock on it, listen for the right sound).
More niche discussions about watermelon include analyses of racial stereotypes, a story about Palestinian prisoners’ daily fruit allowance, and a report on a new variety of cold weather watermelon grown in Turkey.
What conclusions can we draw from this sampling of watermelon content? Over time I’ll be able to draw quantitative conclusions about the state of watermelon journalism on the web. Watermelon may not be aggregated often (there are no watermelon sections of the newspaper); however, the ability to easily track more important ideas involving watermelon (like racial stereotypes) over time could prove illuminating.

Content Organization = Collective Knowledge


Twitter can gather direct, mass conversation into subject categories like #watermelon, but the conversation is limited by the short-form nature of the platform. If longer-form methods of online communication could be aggregated into a similar form of direct conversation, it would serve spectators and authors alike.
For that to happen, citation must be standardized. Current citation methods like hashtags are rarely, if ever, exhaustive, and they often take on the subjective viewpoint of the author or sharer.
Imagine the level of constructive debate and creativity that we might achieve when we organize and bucket all web content into Twitter-like categories. Imagine the kinds of things we might learn about our collective culture.


by at Mashable


1 comment:

  1. Really riveting piece. I never knew that the collective "brain farts" from one site could hide so much untapped data, and I never realized how useful a tool Twitter can be.
