stmllr.net

Thoughts about classifying objects in a digitalized world

by on stmllr.net

While reading for my thesis, I found some interesting articles about order and hierarchies on the web. One central issue in a digitalized world is the question of how to classify objects.

Before thinking about classification in a digital environment, it is useful to take a look at the good old analog world. What is the difference between classifying objects in an analog world and in a digital one? When we face a large quantity of objects, we tend to order them so that the sheer complexity stays manageable. One way to do that is to categorize things in a hierarchical order.

For example, a friend of mine collects music records. He needs to order the records on his shelf to be able to quickly find a particular album. He chose the following hierarchical order:

1. Genre
2. Musician/Band name
3. Album name

This kind of order is sufficient for finding most of his albums. But from time to time he is unsure which genre to choose for a new album. Dub, Reggae or Ska? Having to decide in terms of either/or pushes him into the uncomfortable situation of making more or less poor decisions.

Now think about digitizing all his records. The usual way to structure the digital copies is to save each song as a single file on the hard disk. These files are stored inside directories that represent the band name and/or the album name:

/Music/Bandname/Albumname/Songname.ogg

What shows up here is the habit of keeping structures of the analog world when transforming content into a digital form: a song belongs to a record, and the record belongs to a band. This hierarchical paradigm seems to be widespread on desktop computers, although it is not mandatory. One can just as well use a database instead of a filesystem.

When it comes to finding a certain song, the file structure is not a limitation, because one can add any meta information one likes to a song. For example, a genre can be assigned to a song by using a metadata tag (an ID3 tag for MP3 files, a Vorbis comment for Ogg files). Based on this metadata, music library software can crawl all directories and list all Reggae songs. We could even assign more than a single genre, just as this blog article is tagged with multiple tags. No more decisions in terms of either/or, but in terms of as-well-as. That means more freedom in classification and fewer misclassifications caused by enforced yes-or-no decisions.

In contrast to the analog world, digital content can be separated from its mode of presentation. In combination with networking, this opens up a new kind of freedom when it comes to classifying content. In particular, collaborative classification as realized in so-called folksonomies seems to be very helpful and popular.
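To make the "as-well-as" idea concrete, here is a minimal sketch of how a music library might list songs by genre when each song can carry several genre tags. The paths, the genre assignments and the function name `build_genre_index` are made up for illustration; a real library would read the tags from the files themselves, which is left out here.

```python
from collections import defaultdict

# Hypothetical library: each file path carries a *set* of genre tags,
# so a song does not have to live in exactly one genre.
LIBRARY = {
    "/Music/Bandname/Albumname/Songname.ogg": {"Reggae", "Dub"},
    "/Music/Otherband/Otheralbum/Othersong.ogg": {"Ska"},
    "/Music/Thirdband/Thirdalbum/Thirdsong.ogg": {"Ska", "Reggae"},
}


def build_genre_index(library):
    """Map each genre to the sorted list of paths tagged with it."""
    index = defaultdict(list)
    for path, genres in library.items():
        for genre in genres:
            index[genre].append(path)
    return {genre: sorted(paths) for genre, paths in index.items()}


genre_index = build_genre_index(LIBRARY)

# The directory layout stays Band/Album/Song, yet the same file can be
# found under every genre it was tagged with.
reggae_songs = genre_index["Reggae"]
```

The directory hierarchy is untouched; the genre lookup is simply a second, independent view on the same files.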

The other side of the coin is an increasing diffusion, which may lead to disorientation. Human beings are heavily bound to the rules of the analog world and used to its limitations. Simple models of classification help to avoid uncertainty, even when they tend to cause wrong decisions. The challenge in a digitalized world is to use collaborative forces against the increasing confusion in a complex world. Folksonomies as a classification model seem to be a helpful step in that direction. There might come a day when we have to stop using hierarchically ordered files and folders, because this kind of categorization becomes too complex for handling the rapidly growing amount of data.
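As a rough illustration of how a folksonomy can counter that confusion, the following sketch aggregates the tags that several people assigned to the same album and counts how often each tag was chosen. The user names, album names, tags and the function `tag_counts` are purely hypothetical; this is one simple way to picture the idea, not a description of any particular tagging service.

```python
from collections import Counter

# Hypothetical tag assignments as (user, album, tag) triples.
TAGGINGS = [
    ("alice", "Albumname", "reggae"),
    ("bob", "Albumname", "dub"),
    ("carol", "Albumname", "reggae"),
    ("bob", "Albumname", "ska"),
    ("alice", "Otheralbum", "ska"),
]


def tag_counts(taggings, album):
    """Count how many distinct users applied each tag to one album."""
    # A set removes duplicates, so a user repeating a tag counts once.
    seen = {(user, tag) for user, item, tag in taggings if item == album}
    return Counter(tag for _user, tag in seen)


counts = tag_counts(TAGGINGS, "Albumname")

# No single person decides the genre; the most frequent tag emerges
# from many individual choices, and the minority tags stay available.
top_tag, top_votes = counts.most_common(1)[0]
```

Nobody has to make the either/or decision alone: the album stays findable under "dub" and "ska" as well, while the counts hint at which label most people would expect.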

Comments

  1. Steffen

    If you are interested in more details, have a look at the bachelor thesis 'File Systems and Usability - the Missing Link' by Robert Freund.

    He also provides a proof-of-concept called NHFS (nonhierarchical file system). It demonstrates how such a file system could work and is implemented in user space with FUSE on Linux.

    Thanks to Ingo Frost for pointing this out.