Of Tag Searches and Extraction

Lately, I've been mulling over the idea of making Internet discussion topic-oriented again.

My interest in this is primarily self-serving: the quality of my writing, my ability to articulate myself, and my interest in participating in online discussion have noticeably dipped since the rise of social media. Like any trained skill, this dwindling is the product of atrophy: most of my time is now spent on more immediate and people-focused forms of social media. Twitter in particular discourages long-form, carefully considered prose, and I've been feeling the strain of constraining my thoughts into 140 characters more and more as I try desperately to be less terse post-personal-depression.

The trouble is we now have a blogosphere that is, for the most part, very diffuse, disorganized, and disconnected. Writing posts on a personal blog, like I am doing here, simultaneously feels like I'm yelling into the void hoping desperately to be noticed and, when I am noticed, like I'm distracting people from other, more meaningful uses of their time. It's lossy, precisely because until you read the words that I'm typing here, it's difficult to determine what I'm about to go on about.

This is the inherent problem with people-oriented social media. While it connects us with a wider and richer audience of people, it also carries the expectation that everything those people say, or at least a reasonable subset chosen at random, will be read by all participants who follow them. This puts the burden of topic discovery on the reader, as they try to determine, for each post in their social stream, whether the content is meaningful for them.

This is a bad paradigm. People are very bad at being spontaneously consistent, or failing that, spontaneously supportive of the expectations of their audience. Indeed, it is a rare blogger on social media who focuses solely on the content of their work or the interests of their audience, instead of cathartic spontaneity or the topical profusion and profundity of a Twitter shitter. And when you do find a focused author, chances are they'd really like to sell you something.*

This seems wrong to me. While it gives us a wide array of topics, discussion, voices, and interests, each conversation is sorely lacking in organization, structure, and any form of coherence. The prevailing social expectation is also that these discussions are immediate, transitory, and prone to loss if they aren't picked up on near the time of posting. This leads to a sort of echo chamber effect, as people constantly rehash and rearticulate the same basic concepts for a relatively small number of interested participants, instead of moving forward and building on a topic, idea, or other nexus of research to support their ideas and opinions.**

In short, these posts don't tell an especially good story. They tell an immediate, transient one, indistinguishable from a sound bite in quality and effective longevity. On the posting side, it feels pithy, immediate, and meaningful to capture these ideas close to their original inception point. But the structure to make these bites form part of a broader social tapestry just isn't there, leaving the burden on the reader to figure out what the hell is going on.

So, as an exercise in intellectual curiosity, I've decided to explore this a bit to see if I could do better. My thoughts soon settled on topic-orientation, precisely because it provides a focus and an implied, shared context for each piece of media. This provides a good story: it elevates the visibility of topics within their space, provides room for them to grow, and ceases to shackle them to each individual storyteller. This allows for a broader, pre-existing, shared context in discussion that is once again larger than a single individual.

The closest technical area of research I can find to re-topicizing discussion is tag search and term extraction. Which leads me to an open question: are there any good, multi-social-platform clients that perform tag search and, as a bonus, a simplified form of term extraction? If not, I have half a mind to write one myself using existing APIs and tools, if only to have such a tool at hand.***

In the meantime, I am experimenting with this using Tumblr. They already support tag search and content extraction (but not summarization) using their API, which is as good a start as any.
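
To make the idea a little more concrete, here is a minimal sketch of the two pieces in Python: a tag index over posts gathered from several platforms, and a very simplified term extraction based on word frequency. Everything in it is an assumption for illustration: the sample posts, the field names (`source`, `tags`, `text`), the stopword list, and the function names are all hypothetical, and the posts are hard-coded rather than fetched from any real API.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample posts. A real client would fetch these from each
# platform's API and normalize them into one shape like this.
POSTS = [
    {"source": "tumblr", "tags": ["Writing", "social media"],
     "text": "Long-form writing takes practice, and writing daily helps."},
    {"source": "blog", "tags": ["writing"],
     "text": "Topic pages give writing a shared context for discussion."},
    {"source": "tumblr", "tags": ["search"],
     "text": "Tag search groups posts by topic instead of by author."},
]

# A tiny, hand-picked stopword list (an assumption; real lists are longer).
STOPWORDS = {"a", "an", "and", "by", "for", "of", "the", "instead"}


def build_tag_index(posts):
    """Map each lowercased tag to the list of posts that carry it."""
    index = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            index[tag.lower()].append(post)
    return index


def search_by_tag(index, tag):
    """Return all posts with the given tag, ignoring case."""
    return index.get(tag.lower(), [])


def extract_terms(text, top_n=3):
    """Return the most frequent non-stopword terms in the text.

    Ties are broken alphabetically so the result is deterministic.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


if __name__ == "__main__":
    index = build_tag_index(POSTS)
    for post in search_by_tag(index, "writing"):
        print(post["source"], extract_terms(post["text"]))
```

Frequency counting is about the crudest form of term extraction there is; it is used here only because it needs nothing beyond the standard library. The point of the sketch is the shape: once posts from different platforms share one structure, a topic becomes a lookup key rather than something the reader has to infer post by post.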

* Not that I discourage prospective authors, creatives, and other interests from attempting to sell their wares on social media! It just seems wrong to me that these interests form the majority of what I consider to be focused voices on social media, given the original intentions of the medium.

** One of my roommates wrote a fairly good post that articulates this better than I do here. You can read it at http://kistaro.dreamwidth.org/487228.html.

*** The idea of a company like Google supporting topic-oriented social search, a Google Meta if you will, pleases me. This is more or less the current public direction of their company, so I suspect there are many similar things cooking under the hood that I've simply not heard of.

