ISKO UK - Taxonomist label thyself



6 Jan 2022 4:09 AM | Judi Vernau

I usually call myself an information architect, but at other times I’m a taxonomist or even an ontologist. What’s in a name?

As far as I know, it was Richard Saul Wurman who coined the term ‘information architect’ and used it for his book Information Architects (Graphis, 1997). Three years later I went for an interview for a job with a major publisher that required analysing the text and data that they published and suggesting how that could be structured and linked for further exploitation. The person interviewing me knew what he was trying to achieve but he admitted that he wasn’t sure what the role would be called. ‘I do’, I said, ‘You need an information architect.’ So that’s what I became, though I can’t remember how I’d come to know the term. When Wurman came up with it, one of his definitions was ‘A person who creates the structure or map of information which allows others to find their personal paths to knowledge’. The field he was actually referring to was what some of us might call knowledge representation these days, but the idea of structures and maps is very relevant.

Morville & Rosenfeld in the famous ‘polar bear book’ (Information Architecture for the World Wide Web, O’Reilly, 1st ed., 1998) say that ‘structuring, organizing and labelling [are] what information architects do best.’ They’re thinking about it in the context of web sites, and the people who do web design often call themselves information architects. For me, the underlying structures, metadata and vocabularies are part of the information architecture, and the actual layout and functionality is part of user experience (UX) design, a very closely related but different field of expertise.

There again, some say that just the navigation on a web site is the information architecture. Now that, to me, is part of the taxonomy. Oh no, another tricky term! What do we mean by taxonomy? ISO 25964 says it’s ‘A scheme of categories and subcategories that can be used to sort and otherwise organize items of knowledge or information’, which could mean pretty much any kind of categorisation of content. I won’t go into the history of categorisation of information here (try good old Wikipedia, https://en.wikipedia.org/wiki/Library_classification, for a brief overview). Suffice it to say that from the 1960s to the 1980s there were primarily classification schemes (eg the Dewey Decimal System or the British Catalogue of Music Classification) and thesauri (eg CABI Thesaurus or the UK Department of Health Thesaurus). Their classes (for the former) or terms (for the latter) were applied to content by specialists, called librarians, cataloguers or information scientists, in order to support information retrieval.

By the 1990s, the ubiquity of office computers and the rise of the World Wide Web had started the information explosion, and organisations – at least the smart ones – recognised the need to attempt to corral their knowledge and information to make it findable, usable and manageable. One tool in the armoury was the enterprise taxonomy, and I duly became a Taxonomist. But what did we mean by taxonomy? In an organisational context it meant a vocabulary, or set of vocabularies, that could be used across the business to provide consistent meaning, as well as standard vocabularies for tagging, searching and navigating documents. That’s clearly different from a classification scheme, which generally uses some form of code, perhaps numbers or a combination of numbers and letters, to represent a domain – although you could argue that the ISO definition of taxonomy given above fits that description perfectly. But if classifications are about representational codes, and taxonomies are about terminology, how does a taxonomy differ from a thesaurus? Another time, perhaps!

For the purposes of this blog, let’s stick with the idea that taxonomy represents the language, labels and vocabularies within a given domain, whether that’s an organisation, a web site, or a specific set of content. It can manifest as individual alphabetical lists of terms and/or hierarchical structures; its job is to support some or all of the goals of findability, usability and management of content. Yes, a thesaurus can do that too. The key difference is probably that a thesaurus will most likely conform to the standard properties and relationships described in ISO 25964, whereas a taxonomy could be somewhat looser in its relationships, for example in the construction of a navigation tree.
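For anyone who finds a data structure easier to picture than a definition, here is a rough Python sketch of that difference. It is only an illustration under my own assumptions: the class names, field names and example terms are all made up, and the thesaurus record simply borrows the familiar BT/NT/RT/UF relationship tags rather than modelling ISO 25964 properly.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# All names and example terms below are hypothetical, for illustration only.


@dataclass
class ThesaurusTerm:
    """A preferred term with explicit, typed thesaurus-style relationships."""
    label: str
    broader: List[str] = field(default_factory=list)   # BT
    narrower: List[str] = field(default_factory=list)  # NT
    related: List[str] = field(default_factory=list)   # RT
    used_for: List[str] = field(default_factory=list)  # UF (non-preferred labels)


@dataclass
class NavNode:
    """A node in a looser navigation tree: a label and its children, nothing more."""
    label: str
    children: List["NavNode"] = field(default_factory=list)


def nav_paths(node: NavNode, prefix: Tuple[str, ...] = ()):
    """Yield the path of labels from the root to every node, parents first."""
    path = prefix + (node.label,)
    yield path
    for child in node.children:
        yield from nav_paths(child, path)


# Thesaurus-style: every link says what kind of relationship it is.
cats = ThesaurusTerm(
    "cats", broader=["mammals"], related=["pet care"], used_for=["felines"]
)

# Navigation-style: "Pet insurance" sits under "Pets" because that is where
# users will look for it, not because it is a narrower term of "Pets".
site = NavNode(
    "Home",
    [
        NavNode("Pets", [NavNode("Cats"), NavNode("Pet insurance")]),
        NavNode("About us"),
    ],
)

paths = list(nav_paths(site))
```

The thesaurus record has to say why two terms are connected; the navigation tree only has to say where something lives.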

So what about ontology? It seems to have taken over from taxonomy as the semantic tool du jour. Again it’s based on semantics, and very importantly with much more emphasis on the concept itself (which may have many labels) and its relationships to other concepts, as well as on technical standards for representation and querying. I sometimes call myself an ontologist now, and have indeed built formal ontologies for a number of government departments in New Zealand, but is this really just a fancy taxonomy? The World Wide Web Consortium (W3C) helpfully says ‘There is no clear division between what is referred to as “vocabularies” and “ontologies”. The trend is to use the word “ontology” for more complex, and possibly quite formal collection of terms, whereas “vocabulary” is used when such strict formalism is not necessarily used or only in a very loose sense.’ (https://www.w3.org/standards/semanticweb/ontology.html). So if an organisation thinks it needs an ontology, does it actually want to do the kinds of sophisticated things that such a structure can support (autocategorisation, interoperability, inferencing, knowledge graphs…) or does it just need a metadata scheme designed to support the appropriate level of granularity of content through all stages of its lifecycle, with supporting controlled vocabularies (or thesaurus? or taxonomy?) and possibly a navigation hierarchy? Or to put it another way, an information architecture?
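To make the ‘concept with many labels’ idea a little more concrete, here is a toy Python sketch. It assumes a deliberately simplified model: concepts are keyed by made-up identifiers, labels hang off the concept, and relationships are stored as subject, predicate, object triples. It does not use any real ontology standard or library, and the identifiers, predicates and helper functions are all invented for illustration.

```python
# Hypothetical data: identifiers, labels and predicates are invented.
concepts = {
    "c:cat": {"pref": "cat", "alt": ["feline", "moggy"]},
    "c:mammal": {"pref": "mammal", "alt": []},
    "c:vet": {"pref": "veterinarian", "alt": ["vet"]},
}

# Relationships hold between concepts, not between labels.
triples = [
    ("c:cat", "broader", "c:mammal"),
    ("c:cat", "treatedBy", "c:vet"),
]


def resolve(label):
    """Return the id of the concept carrying this label (case-insensitive), or None."""
    wanted = label.casefold()
    for concept_id, names in concepts.items():
        all_labels = [names["pref"], *names["alt"]]
        if wanted in (name.casefold() for name in all_labels):
            return concept_id
    return None


def objects(subject, predicate):
    """Return every concept linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Whichever label the user types, they arrive at the same concept
# and therefore at the same relationships.
via_alt_label = objects(resolve("moggy"), "treatedBy")
via_pref_label = objects(resolve("Cat"), "treatedBy")
```

In a plain term list, ‘moggy’ and ‘cat’ would be two separate entries to keep in step; here they are two names for one thing.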

Those of us working in knowledge organisation care about providing clarity on what terms mean and how they relate to other terms, about controlling language to support findability, and about helping to make knowledge discoverable, usable and manageable. But if we can’t be clear on what our own jargon means, it’s definitely not a good look!

Happy New Year!
