
Taxonomies and other Jargon

A fruit-bearing explanation about apples and pears

This blog is about taxonomies. Besides knowing what the word taxonomy means, it is equally important to know what it does not mean. I therefore present a collection of concepts from the same jargon: terms that are sometimes used in IT.

  • Taxonomy
  • Typology
  • Folksonomy
  • Thesaurus
  • Ontology
  • Canonical model

Not a common IT discipline: involve professionals.

Obviously it is very ambitious to create models that everyone accepts as credible, since people have different views on the same issues, each from their own perception. If the need is there, one shared model can form a good foundation for information architecture, and it has many advantages in IT architecture, for instance where interfaces are concerned. That, of course, is great, but how do you develop a model that everyone in a company agrees on? This is not easy.

Assemble a multidisciplinary team, not only technicians.

If an organization is committed to instruments such as those described above, it is important to consult experts such as:

  • Library scientists
  • Linguists
  • Information technology experts with a penchant for the conceptual field of classification. These are preferred over professionals who have moved to the production side of IT (e.g. developers).

Applying these techniques also requires subject matter experts from within the organization. In this manner, one can set up a first version and present it as a starting point. Do not start by debating everything with everyone, otherwise nothing will come of it. Accept their expertise as you would accept the authority of, for example, an oncologist. Sometimes you have to prevent something from proliferating to keep it valuable.

What about SharePoint?

Term store

Taxonomy and thesaurus are directly reflected in SharePoint. The term store is a tool in which term sets (taxonomies) are housed and where thesaurus functionality is available. SharePoint offers a central term store at the farm or tenant level. Site collections that fall within the scope of this term store automatically inherit the terms of this central facility. Before you set up a term set or a series of term sets, it is wise to sleep on it for more than one night.
Think first and then act.
Moving terms from a specific (local) term store to a generic one (farm or tenant level) is difficult.

Central (generic) or local (specific)

Each site collection contains a term store, and in each site collection it is possible to create specific term sets. Realize at the same time that specific term sets are not accessible from other site collections without customization.

Try as much as possible to work from a central term store, but leave room for change: do not try to arrange everything immediately. A useful bridging tool is the library setting “Enterprise Metadata and Keywords Settings”, which creates an opportunity to add unsorted metadata. The results are shown in the term store and thereby provide a source for maintaining the managed metadata.
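
The difference between central and local term sets can be pictured with a small toy model. This is only a sketch of the visibility rule described above, not the SharePoint API: the class `TermStore`, its methods and the site collection paths are all made up for illustration.

```python
class TermStore:
    """Toy model of term-set visibility; this is NOT the SharePoint API."""

    def __init__(self):
        self.central = {}  # term set name -> terms, shared by all site collections
        self.local = {}    # site collection -> {term set name -> terms}

    def add_central(self, name, terms):
        self.central[name] = list(terms)

    def add_local(self, site_collection, name, terms):
        self.local.setdefault(site_collection, {})[name] = list(terms)

    def visible_term_sets(self, site_collection):
        """Central term sets plus the site collection's own local ones."""
        visible = dict(self.central)
        visible.update(self.local.get(site_collection, {}))
        return visible


store = TermStore()
store.add_central("Departments", ["HR", "Finance", "IT"])
# Hypothetical site collection paths, used only for this illustration.
store.add_local("/sites/hr", "Leave types", ["Holiday", "Sick leave"])

print(sorted(store.visible_term_sets("/sites/hr")))
print(sorted(store.visible_term_sets("/sites/it")))
```

The local term set shows up only for the site collection that owns it, which is why it pays to decide early what belongs in the central store.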

Taxonomy

A word that originates from Greek: a combination of táxis (ordering, arrangement) and nómos (use, rule, law). It is the science of arranging individuals or objects into groups (taxa; singular: taxon).
The term taxonomy can be used both for the method of arranging concepts and for the hierarchical ordering that results from the process. Both such a hierarchical ordering and the activity of arriving at it are called classification. Almost everything can be organized or structured in a taxonomy: life and living organisms, tools, goods, all kinds of things, books, topography, administrative structures, events, etc.

Taxonomy in technology

In computer science, there is a growing need for common terminology in systems and databases, among other things for integrating data from various systems and for the unambiguous exchange of product data, for example in e-business systems and knowledge-driven design. To enable this, standardized definitions of concepts are used, with the terms arranged in a subtype-supertype hierarchy, or taxonomy. Among other things, this structure has the great advantage that properties of supertypes are inherited by subtypes.

In recent years, attempts have been made in computer science and artificial intelligence to create and maintain taxonomies automatically from a set of concepts. An example is the automatic classification of a group of documents, for example in digital libraries. It is remarkable that in this field a distinction is made between taxonomy and typology. The difference lies mainly in the way the classification is established. In a taxonomy, you arrange a group of sample objects by dividing them. The next step is to observe what characteristics a concept has and to place it in a hierarchy by means of overarching features. This process shapes the taxonomy.

In a typology, one starts from the concept. One considers which distinctive characteristics objects might normally have, and then proceeds to classify the actual objects in accordance with these rules. One could say that taxonomies are established empirically (inductively) and typologies conceptually (deductively).
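
A minimal sketch of such a subtype-supertype hierarchy, in which subtypes inherit the properties of their supertypes. The class `Concept`, its methods and the fruit data are assumptions made for this illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a subtype-supertype hierarchy (illustrative names)."""
    name: str
    supertype: Optional["Concept"] = None
    properties: dict = field(default_factory=dict)

    def all_properties(self) -> dict:
        """Collect own properties plus everything inherited from supertypes."""
        inherited = self.supertype.all_properties() if self.supertype else {}
        # Own properties override inherited ones with the same key.
        return {**inherited, **self.properties}

    def is_a(self, other: "Concept") -> bool:
        """True if this concept is `other` or one of its subtypes."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.supertype
        return False


fruit = Concept("fruit", properties={"edible": True, "grows_on": "plant"})
pome = Concept("pome fruit", supertype=fruit, properties={"has_core": True})
apple = Concept("apple", supertype=pome, properties={"grows_on": "apple tree"})
pear = Concept("pear", supertype=pome)

print(pear.all_properties())
print(apple.is_a(fruit))
```

The pear defines no properties of its own, yet it ends up with everything declared on pome fruit and fruit. The apple overrides one inherited value and keeps the rest.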
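
The two routes can be contrasted in a few lines of code. This is a simplified sketch with made-up objects and rules: in the inductive route the groups emerge from what is observed, in the deductive route the types exist before any object is looked at.

```python
from collections import defaultdict

# Sample objects with observed characteristics (illustrative data).
objects = {
    "apple": {"kind": "fruit", "grows_on_tree": True},
    "pear": {"kind": "fruit", "grows_on_tree": True},
    "strawberry": {"kind": "fruit", "grows_on_tree": False},
    "hammer": {"kind": "tool", "grows_on_tree": False},
}


def induce_groups(items, feature):
    """Inductive (taxonomy-like): groups emerge from the observed values."""
    groups = defaultdict(list)
    for name, features in items.items():
        groups[features[feature]].append(name)
    return dict(groups)


# Deductive (typology-like): types and their rules are defined up front.
type_rules = {
    "tree fruit": lambda f: f["kind"] == "fruit" and f["grows_on_tree"],
    "other fruit": lambda f: f["kind"] == "fruit" and not f["grows_on_tree"],
    "vegetable": lambda f: f["kind"] == "vegetable",
}


def classify(items, rules):
    """Assign each object to the first predefined type whose rule it meets."""
    result = {type_name: [] for type_name in rules}
    result["unclassified"] = []
    for name, features in items.items():
        for type_name, rule in rules.items():
            if rule(features):
                result[type_name].append(name)
                break
        else:
            result["unclassified"].append(name)
    return result


print(induce_groups(objects, "kind"))
print(classify(objects, type_rules))
```

Note that the deductive route can yield an empty type (no vegetables here) and leftover objects (the hammer), whereas the inductive route only ever produces groups that actually contain something.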

Backgrounds from other sources

Thesaurus

In the classical sense, a thesaurus is a kind of reference work. A thesaurus is used to find the exact word for an object, a certain technical term or a word with the desired connotation (style considerations).

In modern times it is a tool through which unique concepts are linked by hierarchical, equivalence and associative relationships. The term comes from the Greek and means treasure. It was initially established in linguistics as a logical-systematic (and alphabetical, but not explanatory) dictionary in which the concepts of a language were categorized and compared to related concepts:

  • Synonyms: words that have a similar meaning. Sometimes people use the term data dictionary as a synonym for thesaurus.
  • Hypernyms: words that describe a broader concept. Lexicon has a wider meaning than thesaurus.
  • Hyponyms: words that have a narrower meaning. Synonym list has a narrower meaning than thesaurus.
  • Antonyms: words with the opposite meaning.
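
As an illustration, the four relationship types can be stored per concept. This is a minimal sketch; the classes `Entry` and `Thesaurus` are invented here, and the sample entry simply reuses the examples from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One thesaurus concept and its relationships to other terms."""
    term: str
    synonyms: set = field(default_factory=set)  # equivalence
    broader: set = field(default_factory=set)   # hypernyms
    narrower: set = field(default_factory=set)  # hyponyms
    antonyms: set = field(default_factory=set)  # opposites


class Thesaurus:
    def __init__(self):
        self._entries = {}
        self._lookup = {}  # any known label -> preferred term

    def add(self, entry):
        self._entries[entry.term] = entry
        for label in {entry.term, *entry.synonyms}:
            self._lookup[label.lower()] = entry.term

    def preferred(self, label):
        """Resolve a term or one of its synonyms to the preferred term."""
        return self._lookup.get(label.lower())

    def entry(self, label):
        """Return the full entry for a term or synonym, or None if unknown."""
        return self._entries.get(self.preferred(label))


thesaurus = Thesaurus()
thesaurus.add(Entry(
    term="thesaurus",
    synonyms={"data dictionary"},
    broader={"lexicon"},
    narrower={"synonym list"},
))

print(thesaurus.preferred("Data Dictionary"))
print(thesaurus.entry("data dictionary").broader)
```

Resolving every synonym to one preferred term is what makes the concepts unique: people may search with different words and still land on the same entry.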

The term “thesaurus” is also used for a reference book with a specialized vocabulary for a particular field of interest or profession, such as medicine or music. With the aid of a thesaurus, the catalog of a library, for example, becomes more accessible than with an arrangement that is, in the end, arbitrary.

For categorizing and reference one is not strictly bound by the terms (and the language) of a book, so other media such as video or sound that contain no text or metadata can be described as well.

With a thesaurus, multiple terms can even be assigned to a single publication or item of information.

Folksonomy

A folksonomy is a system in which users apply public tags to online items, typically to help them find those items again. This practice is also known as collaborative or social tagging, social classification or social indexing.

Folksonomy (when it was “invented”) was originally “the result of personal free tagging of information for one’s own retrieval”. The borderline between folksonomy and social tagging (tagging in an open online environment where each user's tags are available to others) is becoming vague. Folksonomy is commonly used in cooperative and collaborative projects such as research, content repositories, and social bookmarking.

The term folksonomy is a blend of the words folk and taxonomy.

If you define a taxonomy as a form of managed metadata, a folksonomy is the opposite: it is just a container of terms with no order. But if you can track how each term is used, you can find terms that are meaningful for an organization, and if you monitor the folksonomy you can promote words to the taxonomies.
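
Monitoring a folksonomy for terms worth promoting can be as simple as counting. The sketch below assumes tag events arrive as (user, tag) pairs and uses an arbitrary threshold; the function name, the threshold and the sample data are all made up.

```python
from collections import Counter


def promotion_candidates(tag_events, taxonomy_terms, min_uses=3):
    """Return free tags used at least `min_uses` times that are not yet managed."""
    managed = {term.lower() for term in taxonomy_terms}
    counts = Counter(tag.strip().lower() for _user, tag in tag_events)
    return [
        (tag, uses)
        for tag, uses in counts.most_common()
        if uses >= min_uses and tag not in managed
    ]


# Illustrative tag events: (user, tag).
tag_events = [
    ("anna", "onboarding"), ("bram", "Onboarding"), ("carla", "onboarding"),
    ("anna", "invoice"), ("bram", "invoice"), ("carla", "invoice"),
    ("dirk", "invoice"), ("anna", "lunch"),
]
taxonomy_terms = ["Invoice", "Contract"]

print(promotion_candidates(tag_events, taxonomy_terms))
```

Here "invoice" is used most often but is already a managed term, and "lunch" is used too rarely, so only "onboarding" is proposed. Whether a candidate really belongs in the taxonomy remains a decision for the people who maintain it.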

Examples:

  • Twitter hashtags
  • Instagram
  • WordPress

In many platforms, such as Blue Kiwi, SharePoint, etc., folksonomies can be presented as tag clouds, which show significance and use through the size of the display.

A tag cloud is one possible result of using a folksonomy; it shows the tags that have been used.

A tag cloud can be the result of editors tagging content. Imagine that you work on a platform like an intranet and there are news articles that you want to collect based on your own context and definitions; then your tags will be displayed in a tag cloud of some kind. Often it will display the terms that are used more often in a bigger font than the terms that are used less often.
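
How a tag cloud turns usage into font size can be sketched as follows. Linear scaling between a minimum and a maximum size is an assumption made here; real platforms may scale differently (for instance logarithmically), and the function name and sizes are invented.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_size=10, max_size=32):
    """Map each tag to a font size that grows linearly with its use count."""
    counts = Counter(tags)
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, uses in counts.items():
        # If every tag is used equally often, give them all the minimum size.
        scale = (uses - low) / span if span else 0.0
        sizes[tag] = round(min_size + scale * (max_size - min_size))
    return sizes


tags = ["news"] * 5 + ["hr"] * 3 + ["lunch"]
print(tag_cloud_sizes(tags))
```

The most used tag gets the largest font, the least used tag the smallest, and everything else falls in between.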

Typology

A typology (in general) is a subdivision of a group of persons, descriptions or objects based on a number of characteristics. For example, Dutch cities can be divided by province or county (cities in Limburg, Holland, Noord-Brabant, …) or according to population: cities with over 500,000 inhabitants, cities with a population of 250,000 to 500,000, or other combinations.
Most groups of objects can be classified in many ways. Some typologies, however, are considered better than others. A typology with empty categories (e.g. cities in Limburg with more than 500,000 inhabitants) can be considered a weak typology. On the other hand, too many objects in one category also make for a poor typology.

The terms typology, classification system and taxonomy are often considered synonymous. In psychology and in computer science and artificial intelligence, however, a distinction between these terms is made. The difference is to be found in the way they are created: taxonomy (empirical) or typology (conceptual).

It is possible that concepts that are related in a typology have no relation in a taxonomy. Say you define a typology of things to take along as a gift when visiting a sick colleague; then you expect to find concepts such as apples, pears, flowers and crossword puzzle magazines.
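
Checking a typology for empty categories is easy to automate. The following sketch crosses province with population class; the city names and population figures are placeholders, not real statistics, and the function names are invented.

```python
from itertools import product

# Illustrative records: (name, province, inhabitants). Not real statistics.
cities = [
    ("City A", "Limburg", 120_000),
    ("City B", "Limburg", 90_000),
    ("City C", "Noord-Brabant", 260_000),
    ("City D", "Noord-Brabant", 40_000),
]

SIZE_CLASSES = ["over 500,000", "250,000 - 500,000", "under 250,000"]


def size_class(inhabitants):
    if inhabitants > 500_000:
        return SIZE_CLASSES[0]
    if inhabitants >= 250_000:
        return SIZE_CLASSES[1]
    return SIZE_CLASSES[2]


def build_typology(records):
    """Cross province with size class and collect the cities in every cell."""
    provinces = sorted({province for _, province, _ in records})
    cells = {cell: [] for cell in product(provinces, SIZE_CLASSES)}
    for name, province, inhabitants in records:
        cells[(province, size_class(inhabitants))].append(name)
    return cells


def empty_categories(cells):
    """Categories that no object falls into: a sign of a weak typology."""
    return [cell for cell, members in cells.items() if not members]


cells = build_typology(cities)
print(empty_categories(cells))
```

Because the categories are defined before the data is looked at, half of them stay empty in this example. That is exactly the weakness the Limburg example points to.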

It’s not likely that you find those concepts combined in a taxonomy.

Ontology

In computer science and logic, an ontology is the result of an attempt to define a complete and strict conceptual scheme for a certain topic or domain. The word ontology originates from philosophy.

An ontology is typically a data structure that describes all relevant entities and their relations within the rules of the domain. In the field of artificial intelligence, the concept of ontology is used to describe the ‘real world’ in a way that a computer can comprehend. Another way to describe it is knowledge representation.

In a semantic web, a computer needs to derive the meaning of either text or metadata from a model; based on that information it can reason and arrive at an effect or conclusion.

An ontology is used as a strict and complete model for a certain domain, mostly in a hierarchical structure, containing all relevant units, their relations and the rules that these units and relations need to comply with.
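
A very small sketch of these three ingredients (units, relations and rules) using plain subject-relation-object triples. The facts, the single rule and the function names are illustrative assumptions; real ontologies are usually expressed in dedicated languages and tools rather than like this.

```python
# Facts as (subject, relation, object) triples; all names are illustrative.
triples = {
    ("apple", "is_a", "pome fruit"),
    ("pear", "is_a", "pome fruit"),
    ("pome fruit", "is_a", "fruit"),
    ("apple", "grows_on", "apple tree"),
    ("apple tree", "is_a", "tree"),
}

# Rule of the domain: what the subject and object of a relation must be.
relation_rules = {"grows_on": ("fruit", "tree")}


def ancestors(concept, facts):
    """Infer every broader concept by following is_a transitively."""
    found, todo = set(), [concept]
    while todo:
        current = todo.pop()
        for subject, relation, obj in facts:
            if subject == current and relation == "is_a" and obj not in found:
                found.add(obj)
                todo.append(obj)
    return found


def violations(facts, rules):
    """List triples whose subject or object breaks a relation rule."""
    broken = []
    for subject, relation, obj in facts:
        if relation not in rules:
            continue
        required_subject, required_object = rules[relation]
        subject_ok = required_subject in ancestors(subject, facts)
        object_ok = required_object in ancestors(obj, facts)
        if not (subject_ok and object_ok):
            broken.append((subject, relation, obj))
    return broken


print(ancestors("apple", triples))
print(violations(triples | {("hammer", "grows_on", "apple tree")}, relation_rules))
```

Nowhere is it stated directly that an apple is a fruit; the program derives it from the model. The same model lets it reject the claim that a hammer grows on an apple tree.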

Canonical models

A term used in data modeling, but one for which it is difficult to provide a definition.

Words that approach the concept

  • Typical
  • Normal, normalized
  • Unique, unambiguous
  • A standardized way of displaying
  • According to acknowledged, accepted rules

It is also an adjective meaning that the subject is in accordance with the canon, the rules (originally ecclesiastical laws). Canonical things are therefore credible, and so is a canonical model.

Canonical used in information architecture

Information architects often talk about canonical models that divide reality into concepts and relationships. A model makes reality visible. A canonical model is a clear conceptual model, based on a standardized and common approach to something in a particular context (a piece of reality), resulting in:

  • Clarity
  • Standardization
  • Common approach
  • Context

A canonical model is unambiguous and can therefore be explained in only one way. The meanings of the concepts in the model are based on a commonly agreed standard. Think of a typical description of a car. A car is a very complex thing, but the model of “car” is quite universal.

The model reduces the complexity of the car to a few key concepts that are related to each other. A typical car has a body, an engine, a steering wheel, a front axle with two wheels and a rear axle with two wheels. The steering wheel is connected to the front axle, and the engine drives one of the axles or both at the same time. This model typifies a car. Every car meets this model. Admittedly, tricycles do not, so the model is not universal, but it holds within the context of a car maker that produces only four-wheeled vehicles.

A canonical model simplifies communication about things in a particular context (e.g. a company). Anyone within that context who knows the model knows what is meant when its concepts are discussed. Put simply, it prevents misunderstandings. The model is, after all, unequivocal.
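
The car model can be written down as a small data structure with a conformance check. This is a sketch of the description above; the class names, the field names and the way "steered" and "driven" are recorded per axle are modeling choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axle:
    wheels: int
    steered: bool = False  # connected to the steering wheel
    driven: bool = False   # driven by the engine


@dataclass(frozen=True)
class Car:
    """Canonical model of a car in the context of a four-wheeler manufacturer."""
    body: str
    engine: str
    front_axle: Axle
    rear_axle: Axle

    def conforms(self) -> bool:
        """Check the rules stated by the model."""
        axles = (self.front_axle, self.rear_axle)
        two_wheels_each = all(axle.wheels == 2 for axle in axles)
        front_is_steered = self.front_axle.steered
        engine_drives_an_axle = any(axle.driven for axle in axles)
        return two_wheels_each and front_is_steered and engine_drives_an_axle


hatchback = Car("hatchback", "petrol",
                Axle(wheels=2, steered=True, driven=True), Axle(wheels=2))
tricycle = Car("trike", "electric",
               Axle(wheels=1, steered=True), Axle(wheels=2, driven=True))

print(hatchback.conforms())
print(tricycle.conforms())
```

The tricycle fails the check, which mirrors the remark above: the model is canonical within its context, not universal.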
