Tagging: Why we use it and where it's going
Tagging is one of the most widely implemented features in all of web software. It's truly ubiquitous in its ability to solve data-linking issues, and knows few boundaries in the type of data it links. You see it used on news sites to denote common subjects across articles, and you see it on commercial sites like Amazon to lump products together into different areas. In fact, it's quite difficult to imagine a situation where tagging would make no sense.
There is an extremely good reason for the pervasiveness of tags: we, as humans, have a natural inclination to group things together. My approach to organization is to put stuff into gigantic unorganized piles, and then once the pile tips over, it gets organized into sub-piles. So what used to be "Gigantic pile of potentially important paper" gets subbed out to "Stuff that actually needs to be kept", "Stuff that needs to be kept for a little while longer", "Stuff that can be trashed", "What in the world is this?!", and my favorite pile, "Whose is this, and how did it get here?". What I'm doing here is a very physical and literal translation of tagging. I've decided that all paper is not equal. Some of my paper is just junk, like the receipt from the grocery store, and other paper, like my diploma, has lasting importance.
The awesome part of tagging is that it solves some limitations of my paper-sorting organizational pattern. My sorting method drew a line in the sand -- a piece of paper has to belong to one category. Say I get a bill from AT&T for my cell phone. That piece of paper is something that needs to be dealt with right away, but it's also a prime candidate for a "Bills" pile. Where does it go? Really, I'm just stuck at this point. But when I tag a document on my computer, or an article on the web, there's nothing that says I can't have two tags on it. What we do to organize ourselves in real life is categorization. Something _belongs to_ a category; my diploma _belongs to_ the important paper stack, just as a granola bar _belongs to_ the breakfast aisle in the supermarket. Tagging takes a different, much more elegant philosophical approach to organization. Paper is paper, and granola is granola. But these have absolutely limitless _attributes_. My diploma is principally paper, but has the attributes of being something I should keep, important, and belonging in a frame.
So this covers why tagging is awesome. It's a paradigm that is not really possible in the real world, with its physical limitation of, you know, my diploma not being able to be in three places at the same time. And sure, I could put post-it notes on all my stuff, so that my diploma was marked as "Important" and "It's a Keeper!". But then I'd lose all the sorting advantages that I get from categorization! My computer, on the other hand, can sort things very quickly on the fly, and so on the computer, tagging wins, hands down.
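To make the difference between piles and tags concrete, here is a minimal sketch in Python. The item names and tags are borrowed from the examples above, and the helper names (`build_tag_index`, `items_with`) are made up for illustration; nothing here is a real library's API. Each item carries any number of tags, and the computer builds the "piles" on demand by inverting that mapping.

```python
from collections import defaultdict

# Hypothetical items and tags, taken from the paper-sorting examples.
tags_by_item = {
    "diploma": {"important", "keep", "frame"},
    "AT&T bill": {"bills", "urgent"},
    "grocery receipt": {"trash"},
}

def build_tag_index(tags_by_item):
    """Invert item -> tags into tag -> items, so any tag can act as a pile."""
    index = defaultdict(set)
    for item, tags in tags_by_item.items():
        for tag in tags:
            index[tag].add(item)
    return index

def items_with(index, *tags):
    """Return the items that carry every one of the given tags."""
    if not tags:
        return set()
    # .get() avoids creating empty entries for tags nobody has used.
    candidate_sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*candidate_sets)

index = build_tag_index(tags_by_item)
print(sorted(index["bills"]))                        # the "Bills" pile
print(sorted(items_with(index, "bills", "urgent")))  # bills that need attention now
```

The AT&T bill sits in both the "bills" and "urgent" piles at once, which is exactly what a physical stack of paper can't do.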
But...there's a problem here. Tagging is awesome on the desktop because I can tell my stuff what its attributes are. I can say that a document that I'm writing is important, work-related, was written in Virginia, and so on and so forth. But what about when I'm on CNN's site? Tags come in three principal varieties: tags which I define for my own organization, tags which someone else defines for my organization, and tags which have no ambiguity (that is, I'm calling pizza a food, you'll call pizza a food, and anyone who wants to disagree can post in the comments for my future entertainment). On the desktop, all of these concepts rule the roost. I can tag my stuff, and if someone sends me a file, I can accept their unambiguous tags and add my own if I want. But all too often on the web, you lose the ability to organize the way you like. If I view an article on CNN, all of the tags are pre-defined, and I can't see a way to organize it the way I want to.
So, how do we fix this? There's no good answer to that really, not at present. There's no magic wand to fix that issue. The problem I described above is pretty abstract, and most of the time, a website is going to offer fairly broad tags which mitigate this issue entirely. But losing the fine details of tagging is losing a lot of what makes tagging really great. Authoring content and giving something complex three or four tags is not going to be helpful, and if I'm the only one tagging something, those tags are of great use only to me. But beyond all this, on a higher level, what I've been talking about here deals entirely with how we classify the contents of what we put on the web, and how we can make these classifications all the more relevant to every viewer of that content. Solving this problem is exciting in and of itself for what it would accomplish in linking data together. It's even more exciting in that by solving a basic tenet of how we relate data inside of a single website, we solve the problem of how terms apply to content across the whole of the World Wide Web. Essentially, we would break the chains that hold back massive improvements in the quality of search engine results.




Comments
Tagipedia
We need a 'Tagipedia' (Wikipedia for tags): a place where anybody can tag any uniquely identified web resource (URL) and thus gradually, collaboratively describe the entire Web in a more semantically sensible way.
Ironically, large social-bookmarking services (e.g. del.icio.us) have enough data to start something like that, but they have not yet exposed it in a meaningful way. What we really need is an API call that would return the top 10 (or 20) most common tags across all users for a specific URL. Given that, all kinds of widgets could be built, including browser extensions.
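As a rough sketch of what such a call might compute, assuming the service stores one record per user per bookmarked URL: the record layout, the sample data, and the function name `top_tags` below are all hypothetical, and this is not del.icio.us's actual API.

```python
from collections import Counter

# Hypothetical bookmark records: (user, url, tags). Not real service data.
bookmarks = [
    ("alice", "http://example.com/article", {"news", "politics"}),
    ("bob", "http://example.com/article", {"news", "election"}),
    ("carol", "http://example.com/article", {"news", "politics", "opinion"}),
    ("dave", "http://example.com/other", {"recipes"}),
]

def top_tags(bookmarks, url, limit=10):
    """Count how many users applied each tag to `url`; return the most common."""
    counts = Counter()
    for _user, bookmarked_url, tags in bookmarks:
        if bookmarked_url == url:
            counts.update(tags)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:limit]

for tag, count in top_tags(bookmarks, "http://example.com/article", limit=3):
    print(tag, count)
```

A widget or browser extension would only need the ranked list for the page being viewed, so the whole exchange could be a single request per URL.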
Del.icio.us, where art thou? Can you hear us?