Narcislaan 24, 5551AR, Valkenswaard, The Netherlands

Taxonomy and other jargon

A fruity comparison of apples and pears.

This blog is about taxonomies, but to know what the word taxonomy means, it is just as important to know what it is not. I present a collection of concepts from the same jar of jargon: terms that are often used in IT.

  • Taxonomy
  • Typology
  • Folksonomy
  • Thesaurus
  • Lemma
  • Ontology
  • Canonical model

Taxonomy

A word with its origin in the Greek language: a combination of táxis (concepts such as ordering) and nómos (words such as usage, rules and law). It is the technique of ordering individuals and objects (things) into groups (taxa, or in the singular, taxon).

The term taxonomy can be used both for the method of arranging concepts and for the hierarchical ordering that results from that process. Such a hierarchical structure or ordering, and the activity of arriving at it, is called classification. Almost anything can be organized or structured in a taxonomy: life and living organisms, tools, goods, all kinds of things, books, topography, administrative structures, events, etc.

Taxonomy in technology

In computer science there is a growing need for common terminology in systems and databases, including for the integration of data from different systems and for the unambiguous exchange of product data, as in e-business systems and knowledge-driven design. To make this possible, standardized definitions of concepts are used, with the terms arranged in a subtype-supertype hierarchy or taxonomy. This structure has another major advantage: properties of supertypes are inherited by subtypes.
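The subtype-supertype idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name `Taxon`, its methods and the fruit data are hypothetical and do not come from any standard or product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Taxon:
    """One concept in a subtype-supertype hierarchy (hypothetical model)."""
    name: str
    parent: Optional["Taxon"] = None
    own_properties: dict = field(default_factory=dict)

    def properties(self) -> dict:
        """Merge inherited properties with this taxon's own; own values win."""
        inherited = self.parent.properties() if self.parent else {}
        return {**inherited, **self.own_properties}

    def lineage(self) -> list:
        """Names from the root supertype down to this taxon."""
        above = self.parent.lineage() if self.parent else []
        return above + [self.name]


# Hypothetical example data, in keeping with the fruit theme.
fruit = Taxon("fruit", own_properties={"edible": True})
pome = Taxon("pome fruit", parent=fruit, own_properties={"has_core": True})
apple = Taxon("apple", parent=pome, own_properties={"typical_colour": "red"})
pear = Taxon("pear", parent=pome, own_properties={"typical_colour": "green"})

print(apple.lineage())
print(pear.properties())
```

The pear never declares that it is edible or has a core; it picks up both from its supertypes, which is the inheritance advantage described above.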

In computer science and artificial intelligence, attempts have been made in recent years to create and maintain a taxonomy from a set of concepts. One example is the automatic classification of a group of documents, for instance in digital libraries. It is notable that in this field a distinction is made between taxonomy and typology. The difference lies mainly in the way the classification comes about. In a taxonomy you arrange a group of example objects by dividing them up. The next step is to determine which characteristics a concept has, and you place it in a hierarchy with the help of overarching features. This process gives shape to the taxonomy.

Typology

A typology (in general) is a subdivision of a group of persons, descriptions or objects based on a number of characteristics. For example, Dutch cities can be divided by province (cities in Limburg, Holland or Noord-Brabant) or by population: cities with over 500,000 inhabitants, cities with a population of 250,000 to 500,000, or other combinations.
Most groups of objects can be classified in many ways. Some typologies, however, are considered better than others. A typology with empty categories (e.g. cities in Limburg with more than 500,000 inhabitants) can be considered a weak typology. On the other hand, too many objects in a single category also makes for a poor typology.
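The check for empty categories can be sketched as follows. The population figures are rough and purely illustrative (they are not official statistics), and the function names and the third size class are my own assumptions.

```python
from collections import defaultdict
from itertools import product

# Rough, illustrative figures only; not official statistics.
CITIES = [
    ("Maastricht", "Limburg", 120_000),
    ("Venlo", "Limburg", 100_000),
    ("Eindhoven", "Noord-Brabant", 240_000),
    ("Tilburg", "Noord-Brabant", 225_000),
    ("Amsterdam", "Noord-Holland", 900_000),
    ("Haarlem", "Noord-Holland", 165_000),
]

SIZE_CLASSES = ["under 250,000", "250,000-500,000", "over 500,000"]


def size_class(population: int) -> str:
    """Map a population figure to one of the size classes."""
    if population > 500_000:
        return "over 500,000"
    if population >= 250_000:
        return "250,000-500,000"
    return "under 250,000"


def classify(cities):
    """Place every city in a (province, size class) cell."""
    cells = defaultdict(list)
    for name, province, population in cities:
        cells[(province, size_class(population))].append(name)
    return cells


def empty_categories(cities):
    """List every cell of the typology that contains no city at all."""
    provinces = sorted({province for _, province, _ in cities})
    filled = classify(cities)
    return [cell for cell in product(provinces, SIZE_CLASSES) if cell not in filled]


for cell in empty_categories(CITIES):
    print("empty:", cell)
```

With this sample, most cells stay empty while "under 250,000" collects nearly every city, so the typology shows both weaknesses at once.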

The terms typology, classification system and taxonomy are often treated as synonyms. In psychology and in computer science and artificial intelligence, however, a distinction is made between them. The difference lies in the way they are created: a taxonomy is empirical, a typology is conceptual.

Concepts that are related in a typology may have no relation in a taxonomy. Suppose you define a typology of things to bring as a gift when visiting a sick colleague; then you would expect to find concepts such as apples, pears, flowers and crossword puzzle magazines.

It’s not likely that you find those concepts combined in a taxonomy.

Folksonomy

A folksonomy is a system in which users apply public tags to online items, typically to help them find those items again. This practice is also known as collaborative or social tagging, social classification or social indexing.

When the term was “invented”, folksonomy originally meant “the result of personal free tagging of information for one’s own retrieval”. The borderline between folksonomy and social tagging (tagging in an open online environment where each user’s tags are available to others) is becoming vague. Folksonomy is commonly used in cooperative and collaborative projects such as research, content repositories, and social bookmarking.

The term folksonomy is a mix of the words folk and taxonomy.

If you define a taxonomy as managed metadata, a folksonomy is the opposite: it is just a container of terms with no order. However, if you can track how each term is used, you can find terms that are meaningful to an organization, and by monitoring the folksonomy you can promote words to the taxonomies.

Examples:

  • Twitter hashtags
  • Instagram
  • WordPress

In many products, such as Blue Kiwi and SharePoint, folksonomies can be presented as tag clouds, which show the significance and use of each term through its display size.
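Counting tag use, sizing a tag cloud and spotting terms worth promoting to a taxonomy can be sketched like this. It is a minimal sketch: the function names, the font size range, the threshold and the sample documents are all hypothetical, and no product works exactly this way as far as I know.

```python
from collections import Counter


def tag_usage(tagged_items):
    """Count in how many items each tag occurs, ignoring case and spacing."""
    counts = Counter()
    for tags in tagged_items.values():
        counts.update({tag.strip().lower() for tag in tags})
    return counts


def cloud_sizes(counts, min_pt=10, max_pt=32):
    """Scale each tag linearly between min_pt and max_pt by its usage.

    Assumes counts is not empty.
    """
    lowest, highest = min(counts.values()), max(counts.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_pt + (n - lowest) * (max_pt - min_pt) / span)
        for tag, n in counts.items()
    }


def promotion_candidates(counts, threshold):
    """Tags used often enough to be considered for the managed taxonomy."""
    return sorted(tag for tag, n in counts.items() if n >= threshold)


# Hypothetical free tags applied by users to four documents.
ITEMS = {
    "doc1": ["Invoice", "finance", "2019"],
    "doc2": ["invoice", "Finance"],
    "doc3": ["invoice", "holiday"],
    "doc4": ["finance"],
}

usage = tag_usage(ITEMS)
print(cloud_sizes(usage))
print(promotion_candidates(usage, threshold=3))
```

The threshold is a policy choice, not a technical one: someone still has to decide whether a frequently used tag really belongs in the managed vocabulary.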

Thesaurus

In the classical sense, a thesaurus is a kind of reference work. It is used to find the exact word for an object, a certain technical term or a word with the desired connotation (style considerations).

In modern usage it is a tool in which unique concepts are linked by hierarchical, equivalence and associative relationships. The term comes from Greek and means treasure. It was first established in linguistics as a logically and systematically arranged (and alphabetical, but not explanatory) dictionary in which the concepts of a language were categorized and compared to related concepts:

  • Synonyms; words that have a similar meaning. Sometimes people use the term data dictionary as a synonym for thesaurus.
  • Hypernyms; words that describe a broader concept. Lexicon has a wider meaning than thesaurus.
  • Hyponyms; words that have a narrower meaning. Synonym list has a narrower meaning than thesaurus.
  • Antonyms; words with the opposite meaning.

The term “thesaurus” is also used for a reference book with a specialized vocabulary for a particular field of interest or profession, such as medicine or music. With the aid of a thesaurus, the catalog of a library, for example, becomes more accessible than through an arrangement that is, in the end, arbitrary.

When categorizing and referencing, one is not strictly bound to the terms (and the language) used in the work itself. That helps with a book, and even more with media such as video or sound that contain no text or metadata.

A thesaurus can even assign multiple terms per publication or item of information.
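The equivalence and hierarchical relationships can be sketched as a small data structure. The names `Entry`, `Thesaurus`, `preferred` and `expand` are my own assumptions for illustration, not terms from a thesaurus standard, and the sample vocabulary is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    preferred: str                               # the one label used for indexing
    synonyms: set = field(default_factory=set)   # equivalence relationship
    broader: set = field(default_factory=set)    # hierarchical relationship
    related: set = field(default_factory=set)    # associative relationship


class Thesaurus:
    def __init__(self):
        self._entries = {}
        self._lookup = {}  # any known label (lower case) -> preferred term

    def add(self, entry):
        self._entries[entry.preferred] = entry
        for label in {entry.preferred, *entry.synonyms}:
            self._lookup[label.lower()] = entry.preferred

    def preferred(self, label):
        """Resolve a label or synonym to its preferred term, or None."""
        return self._lookup.get(label.lower())

    def narrower(self, term):
        """Preferred terms that name this term as their broader concept."""
        return sorted(e.preferred for e in self._entries.values() if term in e.broader)

    def expand(self, label):
        """The preferred term plus all narrower terms, followed transitively."""
        start = self.preferred(label)
        if start is None:
            return []
        found, todo = [], [start]
        while todo:
            term = todo.pop()
            if term not in found:
                found.append(term)
                todo.extend(self.narrower(term))
        return sorted(found)


thesaurus = Thesaurus()
thesaurus.add(Entry("fruit"))
thesaurus.add(Entry("apple", synonyms={"apples"}, broader={"fruit"}, related={"cider"}))
thesaurus.add(Entry("pear", synonyms={"pears"}, broader={"fruit"}))

print(thesaurus.preferred("Apples"))
print(thesaurus.expand("fruit"))
```

A search for "fruit" can then also retrieve items that were indexed only as "apple" or "pear", which is where a thesaurus earns its keep in a catalog.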

Ontology

In computer science and logic, an ontology is the result of an attempt to define a complete and strict conceptual scheme for a certain topic or domain. The word ontology itself is borrowed from philosophy.

An ontology is typically a data structure, describing all relevant entities and their relations within the rules of the domain. In the field of artificial intelligence, the concept of ontology is used to describe the ‘real world’ in a way that a computer can comprehend. Another way to describe it is knowledge representation.

In a semantic web, a computer needs to derive the meaning of either text or metadata from a model; based on that information it can reason its way to an effect or a conclusion.

An ontology is used as a strict and complete model for a certain domain, mostly in a hierarchical structure, containing all relevant units, their relations and the rules that these units and relations need to comply with.
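Units, relations and rules can be made concrete with a toy example. This is a sketch in plain Python under my own assumptions; real ontologies are usually written in dedicated languages, and every name below (the relations, the instances, the inference rule) is hypothetical.

```python
# relation -> (required class of the subject, required class of the object)
RELATIONS = {
    "works_for": ("Person", "Company"),
    "located_in": ("Company", "City"),
}


def validate(instances, facts):
    """Check every fact against the rules of the domain; return the violations."""
    errors = []
    for subject, relation, obj in facts:
        if relation not in RELATIONS:
            errors.append(f"unknown relation: {relation}")
            continue
        subject_class, object_class = RELATIONS[relation]
        if instances.get(subject) != subject_class:
            errors.append(f"{subject} must be a {subject_class} to be the subject of {relation}")
        if instances.get(obj) != object_class:
            errors.append(f"{obj} must be a {object_class} to be the object of {relation}")
    return errors


def infer_works_in(facts):
    """Rule: works_for(p, c) and located_in(c, city) implies works_in(p, city)."""
    employers = [(s, o) for s, r, o in facts if r == "works_for"]
    locations = {s: o for s, r, o in facts if r == "located_in"}
    return {
        (person, "works_in", locations[company])
        for person, company in employers
        if company in locations
    }


# Hypothetical instances and facts.
INSTANCES = {"alice": "Person", "acme": "Company", "eindhoven": "City"}
FACTS = [
    ("alice", "works_for", "acme"),
    ("acme", "located_in", "eindhoven"),
]

print(validate(INSTANCES, FACTS))
print(infer_works_in(FACTS))
```

Nobody stated that alice works in eindhoven; the conclusion follows from the model, which is the kind of reasoning the semantic web relies on.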

Canonical models

A term used in data modeling that is in itself difficult to define.

Words that approach the concept:

  • Typical
  • Normally, normalized
  • Unique, unambiguous
  • A standardized way of displaying
  • According to acknowledged, accepted rules

Canonical is also an adjective meaning that the subject is in accordance with the canon, the rules (originally ecclesiastical laws). Canonical matters are therefore considered credible, and so is a canonical model.

Canonical in information architecture

Information architects often talk about canonical models that divide reality into concepts and relationships. A model makes reality visible. A canonical model is a clear conceptual model, designed using a standardized and common approach to something in a particular context (a piece of reality), and it results in:

  • Clarity
  • Standardization
  • Common look
  • Context

A canonical model is unambiguous and can therefore be read in only one way. The meanings of the concepts in the model are based on a commonly agreed standard. Think of a typical description of a car. A car is a very complex thing, but the model of a “car” is quite universal.

The model reduces the complexity of the car to a few key concepts that are related to each other. A typical car has a body, an engine, a steering wheel, a front axle with two wheels and a rear axle with two wheels. The steering wheel is connected to the front axle, and the engine drives one of the axles or both at the same time. This model typifies a car. Every car meets this model. Admittedly, three-wheelers do not, so the model is not universal, but it holds within the context of a car maker that produces only four-wheeled vehicles.
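The car model above is small enough to write down and check mechanically. This is a sketch only: the class names, field names and messages are my own assumptions, chosen to mirror the description as closely as possible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axle:
    position: str          # "front" or "rear"
    wheels: int
    steered: bool = False  # connected to the steering wheel
    driven: bool = False   # driven by the engine


@dataclass(frozen=True)
class Vehicle:
    name: str
    has_body: bool
    has_engine: bool
    has_steering_wheel: bool
    axles: tuple


def violations(vehicle):
    """Return every way the vehicle departs from the canonical car model."""
    problems = []
    if not vehicle.has_body:
        problems.append("no body")
    if not vehicle.has_engine:
        problems.append("no engine")
    if not vehicle.has_steering_wheel:
        problems.append("no steering wheel")
    by_position = {axle.position: axle for axle in vehicle.axles}
    if len(vehicle.axles) != 2 or set(by_position) != {"front", "rear"}:
        problems.append("needs exactly one front and one rear axle")
    for axle in vehicle.axles:
        if axle.wheels != 2:
            problems.append(f"{axle.position} axle has {axle.wheels} wheel(s), expected 2")
    front = by_position.get("front")
    if front is not None and not front.steered:
        problems.append("steering wheel is not connected to the front axle")
    if not any(axle.driven for axle in vehicle.axles):
        problems.append("engine drives no axle")
    return problems


hatchback = Vehicle(
    "hatchback", True, True, True,
    (Axle("front", 2, steered=True, driven=True), Axle("rear", 2)),
)
three_wheeler = Vehicle(
    "three-wheeler", True, True, True,
    (Axle("front", 1, steered=True), Axle("rear", 2, driven=True)),
)

print(violations(hatchback))
print(violations(three_wheeler))
```

The three-wheeler is rejected for one reason only, which shows how the model's context (a maker of four-wheeled vehicles) is built into its rules.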

A canonical model simplifies communication about things in a particular context (e.g. a company). Anyone within that context who knows the model understands what is meant when its concepts are discussed. Put simply, it prevents misunderstandings. The model is, after all, unequivocal.

How to get to these instruments

Obviously it is very ambitious to create models that everyone accepts as credible, since people have different views on the same issues, each from their own perception. If the need is there, one model can form a good foundation for information architecture, and it has many advantages in IT architecture, for example for interfaces. That, of course, is great, but how do you develop a model that everyone in a company agrees on? This is not easy.

Not a common IT discipline; involve professionals

Wherever more than one person is involved, people turn out to have different views of things. Their “images” often conflict: one person needs a lot of detail, while another is satisfied with very little.
If an organization is committed to instruments as described above, it is important to consult experts such as:

  • Library scientists
  • Linguists
  • Information technology experts with a penchant for the conceptual field of classification; they are preferred to professionals who have moved to the production side of IT (e.g. developers).

Applying these techniques also calls for subject matter experts from within the organization. In this way, a first version can be drawn up and presented as a starting point. Do not debate it with everyone at once, otherwise nothing will ever come of it. Accept expertise as you would accept the authority of, for example, an oncologist. Sometimes you have to prevent something from proliferating in order to keep it valuable.

What about SharePoint?

Term store

Taxonomy and thesaurus are directly reflected in SharePoint. The term store is the tool in which term sets (taxonomies) are housed and where thesaurus functionality is possible. SharePoint offers a central term store at the farm or tenant level. Site collections that fall within the scope of this term store automatically inherit the terms of this central facility. Before you set up a term set or a series of term sets, it is wise to sleep on it for more than one night.
Think first and then act.
Moving taxa from the specific (local term store) to the generic (farm or tenant level) is difficult.

Central (generic) or local (specific)

Each site collection contains a term store, and in each site collection it is possible to create specific term sets. Keep in mind that, without customization, specific term sets are not accessible from other site collections.
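The difference in scope can be illustrated with a toy model. To be clear, this is plain Python that only mimics the visibility rule described above; it does not call SharePoint, and the class name, method names, site URLs and term sets are all hypothetical.

```python
class TermStoreModel:
    """Toy model of term set visibility; it does not talk to SharePoint."""

    def __init__(self):
        self.central = {}  # term set name -> list of terms
        self.local = {}    # site collection -> {term set name -> list of terms}

    def add_central(self, term_set, terms):
        """Register a term set in the central (farm or tenant level) store."""
        self.central[term_set] = list(terms)

    def add_local(self, site_collection, term_set, terms):
        """Register a term set that belongs to one site collection only."""
        self.local.setdefault(site_collection, {})[term_set] = list(terms)

    def visible_term_sets(self, site_collection):
        """Central term sets plus the site collection's own local ones."""
        own = self.local.get(site_collection, {})
        return sorted(set(self.central) | set(own))


store = TermStoreModel()
store.add_central("Departments", ["Finance", "HR"])
store.add_local("/sites/projects", "Project phases", ["Initiation", "Delivery"])

print(store.visible_term_sets("/sites/projects"))
print(store.visible_term_sets("/sites/hr"))
```

The second site collection sees only the central term set, which is why it pays to decide early whether a term set is generic or specific.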

Try to work from the central term store as much as possible, but leave room for dynamics: do not try to arrange everything immediately. A useful bridging tool is the library setting “Enterprise Metadata and Keywords Settings”, which creates the opportunity to add unsorted metadata. The results are shown in the term store and thereby provide a source for maintaining the managed metadata.

This is a blog by Wim Mulders, 1960, Valkenswaard Dommelen. I write from a 30-year background in process and information design. By day I work as an Enterprise Content Management consultant for Atos, with more than 10 years of experience in Microsoft / Office 365 and SharePoint. I write a lot of material there, but on my own site I do not have to be “diplomatic” or follow company standards. That is why I use this site: to write my own message, in my own style.