Not just NPTs: Thinking in Labels
The taxonomy standards (ANSI/NISO Z39.19, ISO 25964-1, et al.) were written around the standard thesaural relationships and their abbreviations: BT (Broader Term), NT (Narrower Term), RT (Related Term), and NPT (Non-Preferred Term, also written UF, “Use For”). Accordingly, the first generation of taxonomy software used these relationships as structural building blocks for designing and storing taxonomies. Although most such tools could export to SKOS (and other RDF-based formats), their fundamental design adhered to the standards.
With the widespread adoption of RDF (including but not limited to SKOS) and RDF-based taxonomy management, this is changing. In addition, the application of semantic structures beyond information tagging and retrieval, together with the rise of ontologies and graph-based applications for KOS (Knowledge Organization Systems), admit (and sometimes require) more specific relationships between concepts.
Compared to the standard thesaural relations, the relationships outlined in SKOS, as well as other vocabulary-centric ontologies, allow greater expressivity and nuance in describing relationships between concepts.
For example, in a traditional thesaurus NPT (Non-Preferred Term), or UF (Use For), is used to describe all alternative labels for a concept: synonyms, alternative spellings, lexical variants, abbreviations, acronyms, and any other versions of a concept’s label that are useful to store for search (including query expansion), browse, and other operations. This can include common misspellings and even more granular terms that did not warrant inclusion in your vocabulary as concepts of their own but could still be useful (like breeds of dogs or types of cookware). For a concept such as Cookware, it is common to find something like:
UF Cook ware
UF Cooking utensils
UF Frying pans
UF Stock pots
The goal here is to capture variants of a concept’s label that are useful for exploding search queries, NLP operations, auto-classification of documents, and so on. And, for this purpose, lumping all of the variants together makes good sense: they’re all doing the same thing.
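To make the "lumping" concrete, here is a minimal sketch of how a flat UF list might drive query expansion. The dictionary layout and the function names (build_lookup, expand_query) are assumptions made for illustration, not the format of any particular taxonomy tool.

```python
# Hypothetical flat thesaurus entry: every variant is simply a UF ("Use For") term.
USE_FOR = {
    "Cookware": ["Cook ware", "Cooking utensils", "Frying pans", "Stock pots"],
}

def build_lookup(use_for):
    """Map every term (preferred or non-preferred, case-insensitive) to its preferred term."""
    lookup = {}
    for preferred, variants in use_for.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup

def expand_query(query, use_for):
    """Return the preferred term plus all of its UF variants, or the query itself if unknown."""
    preferred = build_lookup(use_for).get(query.strip().casefold())
    if preferred is None:
        return [query]
    return [preferred] + use_for[preferred]

print(expand_query("frying pans", USE_FOR))
# ['Cookware', 'Cook ware', 'Cooking utensils', 'Frying pans', 'Stock pots']
```

Note that the expansion cannot tell a misspelling from a synonym or a narrower term: every variant is treated identically, which is fine for this purpose and limiting for others.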
However, it is useful to be able to separate types of NPTs for different uses. Misspellings are useful for search queries (think about how Google finds stuff even when you misspell your search string) but should not be surfaced as alternatives to users (as it looks terrible).
SKOS, in contrast, features several ways to describe labels, each of which has different uses.
skos:prefLabel is generally used for the preferred version of a term: the primary way of describing a concept in the hierarchy. It is useful to know that skos:prefLabel (like the other SKOS labels) can also carry a language tag, typically the two-letter ISO 639-1 language code. For example, to mark a label as English we use the annotation @en. In the same way, we can store preferred labels in other languages (@es for Spanish, @de for German, and so on), with at most one preferred label per language. These foreign-language equivalents can be useful for some types of queries and text processing. Moreover, if you provide such labels for every concept in a vocabulary, you can publish it in more than one language; this is common for vocabularies maintained by international organizations, such as the UNESCO Thesaurus or the European Union’s EuroVoc, which are published in several languages.
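As a rough sketch of how language-tagged preferred labels might be used, the snippet below picks the label to display for a given interface language. The labels are the ones used for the Cookware example in this article; the dictionary layout, the pref_label helper, and the fallback-to-English behavior are assumptions for illustration.

```python
# Preferred labels keyed by language tag. Using a dict means each language
# can hold only one preferred label, which mirrors the SKOS constraint.
PREF_LABELS = {
    "en": "Cookware",
    "de": "Kochgeschirr",
    "fr": "Batterie de cuisine",
}

def pref_label(labels, lang, fallback="en"):
    """Return the preferred label for `lang`, falling back to `fallback` if it is missing."""
    return labels.get(lang, labels[fallback])

print(pref_label(PREF_LABELS, "de"))  # Kochgeschirr
print(pref_label(PREF_LABELS, "es"))  # Cookware (no Spanish label stored, so English is used)
```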
skos:altLabel is used to capture and store spelling variants, acronyms, abbreviations, irregular plurals, and sometimes other lexical variants such as the practitioners of a discipline (e.g., Psychology altLabel: Psychologists). Generally, altLabel holds labels that you might want to display and that are useful for search redirects, type-ahead functions, and various NLP operations.
skos:hiddenLabel is used for anything you would not want to display but could be useful for back-end processing, search indices, or NLP, such as common misspellings, part-of-speech truncations, and lemmatizations.
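The practical difference between the three label types can be sketched as two views over the same concept: one for what users see, one for what the search index sees. The Concept class and the suggest and matches helpers below are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    pref_label: str
    alt_labels: list = field(default_factory=list)
    hidden_labels: list = field(default_factory=list)

    def display_labels(self):
        """Labels that are safe to show to users (type-ahead, 'see also', and so on)."""
        return [self.pref_label] + self.alt_labels

    def index_labels(self):
        """Everything the search index should match, including hidden labels."""
        return self.display_labels() + self.hidden_labels

def suggest(prefix, concept):
    """Type-ahead: only displayable labels are offered as suggestions."""
    p = prefix.casefold()
    return [label for label in concept.display_labels() if label.casefold().startswith(p)]

def matches(query, concept):
    """Search: a query matches if it equals any label, hidden ones included."""
    q = query.strip().casefold()
    return any(q == label.casefold() for label in concept.index_labels())

cookware = Concept("Cookware", alt_labels=["Cooking utensils"], hidden_labels=["Cook ware"])

print(suggest("cook", cookware))       # ['Cookware', 'Cooking utensils']
print(matches("cook ware", cookware))  # True
```

The misspelling "Cook ware" still finds the concept, but it is never offered back to the user as a suggestion.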
It is also possible to declare your own types of alternative labels (beyond those outlined in SKOS) to differentiate versions of a concept for other purposes.
For example, if you want to include narrower terms as alternative versions of a concept (terms that do not have sufficient warrant to be included as concepts in their own right, but could be useful for text processing or search queries), you could declare a label type like “includesNarrower” to differentiate these from the other altLabels discussed above.
Returning to the example above, “frying pans” and “stock pots” (or, say, “woks”) are all more specific kinds of Cookware. If I don’t have content or products specific enough to warrant including these as separate concepts, but I want to redirect search queries for “Frying pans” to “Cookware”, I can use includesNarrower to store these more granular versions of the concept separately from those using the generic altLabel.
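One way such a redirect might look in practice is sketched below. Because the narrower terms are stored under their own label type, the application can tell the user that it has broadened the search. The data layout and the resolve and redirect_message functions are assumptions made for this example; includesNarrower is the custom label type described above, not part of SKOS.

```python
# Labels for a hypothetical Cookware concept, grouped by label type.
COOKWARE = {
    "prefLabel": ["Cookware"],
    "altLabel": ["Cooking utensils"],
    "hiddenLabel": ["Cook ware"],
    "includesNarrower": ["Frying pans", "Stock pots"],  # custom label type
}

def resolve(query, concept):
    """Return (preferred label, matched label type), or None if no label matches."""
    q = query.strip().casefold()
    for label_type, labels in concept.items():
        if any(q == label.casefold() for label in labels):
            return concept["prefLabel"][0], label_type
    return None

def redirect_message(query, concept):
    """Phrase the redirect differently when the query was a narrower term."""
    hit = resolve(query, concept)
    if hit is None:
        return None
    preferred, label_type = hit
    if label_type == "includesNarrower":
        return f"Showing results for the broader category '{preferred}'"
    return f"Showing results for '{preferred}'"

print(resolve("frying pans", COOKWARE))          # ('Cookware', 'includesNarrower')
print(redirect_message("frying pans", COOKWARE))
print(redirect_message("cook ware", COOKWARE))
```

With a single undifferentiated UF list, the two messages could not be distinguished.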
Other custom labels (or label types reused from other ontologies!) are also possible; you might consider specific label types for acronyms, abbreviations, or any other alternative labels that are useful to store and use separately.
We can now express the labels in the Cookware example in a more specific and useful way:
prefLabel @en Cookware
prefLabel @de Kochgeschirr
prefLabel @fr Batterie de cuisine
hiddenLabel Cook ware
altLabel Cooking utensils
includesNarrower Frying pans
includesNarrower Stock pots
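For readers who want to see roughly what this looks like as RDF, the sketch below writes the same labels out as Turtle using plain string formatting. Several things here are assumptions: the ex: namespace URI is a placeholder, ex:includesNarrower stands in for the custom label type, and the labels without a language in the listing above are tagged @en. The to_turtle helper does no escaping, so it is only suitable for simple labels like these; a real project would more likely use an RDF library.

```python
EX = "http://example.org/vocab#"  # placeholder namespace, not a real vocabulary

# (property, label text, language tag); the @en tags on the last four are assumed.
LABELS = [
    ("skos:prefLabel", "Cookware", "en"),
    ("skos:prefLabel", "Kochgeschirr", "de"),
    ("skos:prefLabel", "Batterie de cuisine", "fr"),
    ("skos:hiddenLabel", "Cook ware", "en"),
    ("skos:altLabel", "Cooking utensils", "en"),
    ("ex:includesNarrower", "Frying pans", "en"),
    ("ex:includesNarrower", "Stock pots", "en"),
]

def to_turtle(subject, labels):
    """Serialize one concept and its labels as a Turtle snippet (no escaping performed)."""
    header = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{EX}> .",
        "",
        f"{subject} a skos:Concept ;",
    ]
    statements = [f'    {prop} "{text}"@{lang}' for prop, text, lang in labels]
    return "\n".join(header) + "\n" + " ;\n".join(statements) + " ."

print(to_turtle("ex:Cookware", LABELS))
```

The output starts with the two prefix declarations and ends with the last ex:includesNarrower statement, terminated by a period.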
In general, the shift in focus from terms to concepts with labels is tricky and requires some mental gymnastics to get used to. But the flexibility (and greater specificity!) of this approach can provide useful implementations of your vocabulary.