booktwo.org

The idiocy of lazy categorisation

August 30, 2007

I was quite interested when I heard about StoryCode.co.uk (via Zero Influence – there’s a .com version too). At first sight, I thought it might be a newer, better version of WhichBook.net: a way of classifying books to create a more accurate “If you liked this, you’ll love…” recommendations system. Its advantage over WhichBook is that it encourages visitors to “code” books they’ve read, which are then added to the system along with the data. That is a great advance on using professionals behind the scenes to classify books, an approach which has only managed a couple of hundred titles in several years for WhichBook, and it is all very good and user-generated and modern.

That’s as far as it goes, however, because instead of allowing users any flexibility in how they describe the book, all literary opinion is forced onto a selection of 50 or so sliders, which veer from the confusing, to the pointless, to the incompetent. Confusing example: “Is the story mostly aimed at a mainstream audience or a literary audience?” (Mainstream → Literary) excludes half of the fiction I read. Pointless example: “How much did the atmosphere of the story feel like one you could experience in everyday life or is it more exotic or surreal?” (Everyday → Exotic) might work for actually surreal books, but for most novels, the answer depends on the reader, not the story—but the program won’t know this. Incompetent example: the ‘Plot Type’ category. To what extent is the book a “Rags to Riches” story (None → Plenty), a “Pact with the Devil” story (None → Plenty), a “Brain Vs Brawn” story (None → Plenty)—these aren’t sliding scales, they’re either/or. The results are meaningless. Try it yourself, and see if the results aren’t suspiciously vague (short version: if you put in a thriller, you get a broad selection of thrillers out).

This annoys me because something so long planned – ten years in the making – with a variety of book and web industry heavyweights behind it (see the About page) really should be better than this. It annoys me that something with such a strong and correct rallying call, “inspired by the belief that the Book Trade, locally and globally, is failing its readers in the search for new stories and that the power of the internet and the passion of book lovers everywhere can combine into a unique service”, should result in something so poorly thought out.

Digital recommendation systems have come a long way in the last few years, and there are a number of really important lessons which StoryCode has completely ignored. The one they get right is user-generation, but they’ve failed to see that for user data to be valuable, it has to be ambient. Last.fm and LibraryThing don’t ask you a bunch of equivocal questions that are highly dependent on the individual’s situation and whim: they just see what you’re into, and run with it. It’s powerful, and it works (why do you think CBS bought Last.fm for a small fortune, or Abe Books bought a chunk of LibraryThing, particularly when Amazon’s recommendation system is so rubbish?).
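To make “ambient” concrete, here is a minimal sketch of recommending purely from what people already have on their shelves, with no questionnaire involved. It is not how Last.fm or LibraryThing actually work; the `recommend` function, the overlap-count scoring, and the sample shelves are all assumptions made up for illustration.

```python
from collections import Counter


def recommend(libraries, user, top_n=3):
    """Suggest titles from shelves that overlap with `user`'s shelf.

    Hypothetical scoring: each unseen title earns points equal to the
    number of books its owner shares with `user`.
    """
    mine = libraries[user]
    scores = Counter()
    for other, shelf in libraries.items():
        if other == user:
            continue
        overlap = len(mine & shelf)
        if overlap == 0:
            continue  # nothing in common, so this shelf tells us nothing
        for title in shelf - mine:
            scores[title] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


# Invented sample data: who owns what.
libraries = {
    "ann": {"Neuromancer", "Snow Crash", "Pattern Recognition"},
    "bob": {"Neuromancer", "Snow Crash", "Cryptonomicon"},
    "cat": {"Neuromancer", "The Diamond Age", "Cryptonomicon"},
    "dan": {"Middlemarch", "Emma"},
}

print(recommend(libraries, "ann"))
```

The only input is behaviour the reader has already exhibited, so there is nothing to equivocate over, and a user with no overlap with anyone simply gets an empty list rather than a vague one.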

The second and equally important lesson is that individuals describe things in many different ways, so let them. Tagging, while rapidly becoming a web cliché, works because it is the most flexible system possible, generating reams of long tail classification data that is individually specific but universally applicable. Tagging also provides another incredibly important feature that StoryCode has missed: an incentive to participate. Through tags, individuals handcode their own dataset: my delicious tags, for example, allow me to find almost any half-remembered link I’ve ever saved with a couple of terms which are meaningful to me. LibraryThing’s Tag Mirror reveals real things about your reading habits. With StoryCode, there’s no long-term incentive to participate, and mass use is what drives these systems.
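The “couple of terms” lookup is simple enough to sketch. This is an assumed, minimal structure, not delicious’s or LibraryThing’s implementation: the `TagIndex` class, its method names, and the example.com links are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Free-form tags mapped to the items they were applied to."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, item, *tags):
        # No fixed vocabulary: any word the user chooses becomes a tag.
        for tag in tags:
            self._by_tag[tag.lower()].add(item)

    def find(self, *tags):
        # Return only the items that carry every one of the given tags.
        sets = [self._by_tag.get(tag.lower(), set()) for tag in tags]
        return set.intersection(*sets) if sets else set()


# Invented bookmarks, tagged however the owner saw fit.
index = TagIndex()
index.add("http://example.com/a", "books", "recommendation")
index.add("http://example.com/b", "books", "typography")
index.add("http://example.com/c", "webstandards", "typography")

print(index.find("books", "typography"))
```

Because the person doing the tagging is also the person doing the finding, the vocabulary only has to make sense to them, which is exactly the incentive a fixed set of sliders cannot offer.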

So, another book industry initiative fails to learn some of the basic lessons of the internet. We shouldn’t be surprised, but how much time and effort is being wasted here?

Oh yeah, and it doesn’t validate. I’m going to start some kind of button system for lit sites that don’t use web standards. We’re all about standards in literature, spelling, punctuation, typography – we have to get the code right too.

You have been reading booktwo.org, the blog of James Bridle: art, literature, and the network, since 2006. Follow the RSS Feed for new posts.