Tagging.
It's the "new" thing. Numerous blogs and companies are turning to the concept of allowing users to tag pictures, music, files, etc. with a descriptive component. It's very similar to the Semantic Web, where web page authors can tag their pages with useful meta-data.
Tagging works great on my home music and photo collection.
Tagging will not work well for the web.
Think back to the beginnings of the web. There were no search engines. I remember having to buy the Whole Internet Catalog to get a listing of web pages.
Then Yahoo! came along with a hand-organized hierarchy of web pages. Everyone moved to Yahoo!. Then AltaVista came along with an automated keyword matcher; no manual sorting of the web was necessary. Everyone moved to AltaVista. (Feel free to continue this exercise with Inktomi and Google.)
The problem with manual organization of any data is that it simply doesn't scale. We're creating data far too fast for anyone to put meaningful tags in place by hand.
Here's another slice of history. When was Google the best? When could I go to Google and actually get the answer I was looking for in a few seconds? Your results may vary, but it was in late 1999 and maybe 2000 for me. Google had hit the sweet spot between the size of their data and their algorithm for tagging the data. I could search for Linux patches and not get 1000 different mirror sites. I didn't have to worry about blogs throwing off PageRank.
Even Google, which uses an automated ranking algorithm, started to exhibit problems around the time it was indexing a billion pages (mid-2000). Manual tagging will start to have problems at a much smaller data set size.
One of the big issues is disambiguation. A classic example is "China". If I search Flickr for China I can get 16 pages in before I give up on searching for a nice teapot and try the related search for Porcelain.
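To make the disambiguation problem concrete, here is a minimal sketch of a flat tag index. The photo names and tags are invented for illustration, and this is not a description of how Flickr actually stores or searches tags. It only shows that a bare tag carries no sense information, so the country and the crockery land in the same bucket.

```python
from collections import defaultdict

# Hypothetical photos and their user-supplied tags (invented for illustration).
PHOTOS = {
    "great_wall.jpg": {"china", "travel"},
    "beijing_street.jpg": {"china", "city"},
    "teapot.jpg": {"china", "porcelain", "tea"},
    "shanghai_skyline.jpg": {"china", "city", "night"},
}


def build_tag_index(photos):
    """Map each tag to the sorted list of photos that carry it."""
    index = defaultdict(set)
    for name, tags in photos.items():
        for tag in tags:
            index[tag].add(name)
    return {tag: sorted(names) for tag, names in index.items()}


INDEX = build_tag_index(PHOTOS)


def search(tag):
    """Return every photo with this exact tag; the index knows nothing about meaning."""
    return INDEX.get(tag, [])


# The teapot is mixed in with the travel photos under "china".
print(search("china"))
# A narrower tag finds it, but only because someone happened to add that tag.
print(search("porcelain"))
```

The searcher has to guess which second tag the uploader chose, which is exactly the "try the related search" dance described above.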
And what about stupid or even malicious tagging? Try a search for Tea. Why are hibiscusroto's photos the majority of photos I see? And this is only with a data set of 3.5 million. Danny Ayers posted about this recently. A picture of a toilet has been tagged "blog." Malicious? Stupid? You decide.
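Here is a rough sketch of why one enthusiastic tagger can take over a tag. The uploader names and counts are made up, and the function is a hypothetical helper, not anything Flickr provides. It simply measures what fraction of a tag's results come from each uploader.

```python
from collections import Counter

# Hypothetical (uploader, tags) records; names and counts are invented.
UPLOADS = [
    ("prolific_user", {"tea"}),
    ("prolific_user", {"tea", "garden"}),
    ("prolific_user", {"tea"}),
    ("prolific_user", {"tea", "cup"}),
    ("prolific_user", {"tea"}),
    ("prolific_user", {"tea"}),
    ("alice", {"tea", "teapot"}),
    ("bob", {"tea", "china"}),
    ("bob", {"coffee"}),
]


def uploader_share(uploads, tag):
    """Return each uploader's fraction of the photos carrying the given tag."""
    counts = Counter(user for user, tags in uploads if tag in tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {user: n / total for user, n in counts.items()}


shares = uploader_share(UPLOADS, "tea")
for user, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{user}: {share:.0%}")
```

Nothing in a plain tag lookup pushes back against this. Whoever tags the most photos owns the results, whether or not their photos are what you were looking for.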
This is the inherent flaw in relying upon user tagging: it simply doesn't scale well. Flickr does have a reason for going with tagging -- automatic image recognition and classification is a hard problem, so it's hard to come up with an automated algorithm to operate on images.
However, the only method that scales to the web is an automated system of classification and organization.
And, yes, that's one of the things we're doing at kozoru. :-)
Nod to Jaron.
(Resurrected via archive.org)