It is fundamental human nature to form relationships. Forming a relationship between two people is based on transparency, mutually beneficial outcomes, and trust.
The world of online semantics is modeled on the way human beings think and interact. As I have described before, the nature of the semantic structure is based on the subject-predicate-object triple. The subject “I” and the object “you” can have a predicate describing the relationship: love, hate, admire, despise, married, see, etc. This relationship is an action.
If we think of the construction of controlled vocabularies based on semantic standards in this way, we can understand the importance of relationships in ontology modeling.
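As a minimal sketch of the subject-predicate-object triple described above, the statements can be held as plain Python tuples. The Triple type and the predicates_between helper are hypothetical names chosen for illustration; they are not part of any semantic standard.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, the predicate (relationship), and an object."""
    subject: str
    predicate: str
    obj: str


# Example statements built from the predicates mentioned in the text.
statements = [
    Triple("I", "admire", "you"),
    Triple("I", "see", "you"),
]


def predicates_between(triples, subject, obj):
    """Return every predicate that links the given subject to the given object."""
    return [t.predicate for t in triples if t.subject == subject and t.obj == obj]


print(predicates_between(statements, "I", "you"))  # ['admire', 'see']
```

The lookup is directional: asking for predicates from "you" to "I" returns nothing, because no statement was made in that direction.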
Hierarchical, equivalence, and associative relationships link things to other things. These relationships can be between two objects, typically two concepts in one or more schemes. They can also be between objects in a controlled vocabulary and external objects. For example, a concept in an ontology can be connected to a tangible content asset, in an internal or external system, that is tagged with metadata concepts from the ontology.
The ANSI/NISO Z39.19-2005 (R2010) standard, Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies, defines some basic relationships for use in controlled vocabularies.
A hierarchical taxonomy has only one necessary relationship: the hierarchical Broader Term (BT)/Narrower Term (NT). In the SKOS standard defined by the W3C, this relationship is expressed by skos:broader and skos:narrower, which carry the labels has broader and has narrower. Broader and narrower relationships can also be polyhierarchical. That is, a concept can have two or more different broader terms in the same scheme. Hierarchical relationships can be generic (BTG/NTG), instance (BTI/NTI), or whole/part (BTP/NTP). I’ve described the general principle of a generic relationship in my blog post here. An instance hierarchical relationship may point to a specific example of the concept, such as bridges > Brooklyn Bridge. A whole/part relationship may express a part of a larger item, such as automobiles > automobile tires.
Thesauri (and in practical use, taxonomies) include equivalence relationships. The USE (U)/USED FOR (UF) relationship expresses equivalence between terms and indicates the preferred term and its associated non-preferred terms. In SKOS, these are defined as the preferred label (skos:prefLabel) and the alternative label (skos:altLabel). In OWL, it is possible to express an equivalence relationship using owl:sameAs, which asserts that two identifiers refer to the same thing. The Use/Used For relationship covers synonyms, lexical variants, near-synonyms, generic postings, and cross references to elements of compound terms.
Most of us who are taxonomy practitioners include the associative relationships defined for a thesaurus in our definition of a taxonomy. Thesauri (and, again, taxonomies in practical use) include associative relationships in the form of Related Term (RT). A related term is an association between terms that is neither hierarchical nor equivalent but shows that the concepts are semantically or conceptually associated. In SKOS, the associative relationship is skos:related with a label of has related. The standard lists many types of associative relationships, all of which may be covered by related to.
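The three basic relationship types can be sketched as a small set of triples that borrow the SKOS property names. This is only an illustration in plain Python, not an RDF implementation: concepts are identified by their preferred labels for brevity, the helper names are hypothetical, and the "tires", "cars", and "highways" entries are invented examples that do not come from the standard.

```python
# Each entry is (subject, predicate, object), using SKOS property names as plain strings.
TRIPLES = {
    ("Brooklyn Bridge", "skos:broader", "bridges"),       # instance (BTI/NTI)
    ("automobile tires", "skos:broader", "automobiles"),  # whole/part (BTP/NTP)
    ("automobile tires", "skos:broader", "tires"),        # invented: second parent (polyhierarchy)
    ("automobiles", "skos:altLabel", "cars"),             # invented: USE/USED FOR equivalence
    ("automobiles", "skos:related", "highways"),          # invented: Related Term (RT)
}


def objects(subject, predicate):
    """Return the sorted objects linked from a subject by a predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def narrower(concept):
    """Derive narrower terms by reading skos:broader in the opposite direction."""
    return sorted(s for s, p, o in TRIPLES if p == "skos:broader" and o == concept)


def resolve(term):
    """USE lookup: map a non-preferred label to its preferred concept."""
    for s, p, o in TRIPLES:
        if p == "skos:altLabel" and o == term:
            return s
    return term  # already a preferred label, or unknown to this vocabulary


print(objects("automobile tires", "skos:broader"))  # ['automobiles', 'tires']
print(narrower("bridges"))                          # ['Brooklyn Bridge']
print(resolve("cars"))                              # automobiles
```

Storing only skos:broader and deriving the narrower direction avoids recording the same hierarchical fact twice; a vocabulary management tool may well store both.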
Other Standard & Custom Relationships
While ANSI/NISO Z39.19-2005 (R2010) doesn’t include other types of relationships, it also doesn’t exclude them from use in controlled vocabularies. Ontologies often require greater specificity of relationships between concepts. Beyond the standard broader/narrower, related, and equivalence (prefLabel and altLabel) relationships, an ontology may need uni-directional or inverse relationships to express the connection between concepts. Ontologies, and their use as a basic underpinning for knowledge graphs, are strongly relationship-based.
We can look to additional semantic standards like those defined by W3C for relationship definitions for use in controlled vocabularies. SKOS provides a standard way to represent knowledge organization systems using the Resource Description Framework (RDF). SKOS includes relationships for semantic relations and mapping relations. These include hierarchical, associative, and versions of equivalence as used in taxonomies. Encoding this information in RDF allows it to be passed between computer applications in an interoperable way. OWL (Web Ontology Language) defines ontological properties and relationships, including mapping relationships like owl:sameAs.
The ability to create custom relationships allows us to use predicates from W3C standards such as SKOS and OWL, from other published vocabularies such as FOAF and Schema.org, and our own relationships to describe concepts and the relationships between them. There are properties and relationships “out in the world” that you can adopt and re-use in your own ontology. This has the virtue of allowing us to use the same relationship to describe the same things and makes ontologies easier to combine.
Focus on Relationships
Defining such basic concept elements as the preferred label, alternative label, definition, scope note, and other properties to describe concepts in a controlled vocabulary is a basic premise of the practice. So are the relationships between concepts. As we get into modeling inter-related concepts and schemes and our models become more complicated, a focus on defining relationships becomes critical.
A frequent modeling problem centers around whether to pre-coordinate or post-coordinate concepts. Pre-coordination can take the form of creating a single label with many pre-defined concepts or creating navigational paths which help users browse to a piece of content (which may be an article, product, or any other resource). It is easy to see pre-coordination in action on a retail website allowing users to pick categories and browse down a hierarchy to a specific product, such as Appliances > Small Kitchen Appliances > Coffee Makers. Post-coordination is typically in the form of one or more subject taxonomies in which individual single or multi-word concepts live contextually. Post-coordination works well for search, in which a user may skip the higher navigational levels and enter the keyword “coffee makers” in order to get results.
Websites are made to support both, but complex subject environments quickly break down with attempts at pre-coordination in the vocabularies. For example, including navigational paths for Women’s Clothing > Shirts, Men’s Clothing > Shirts, and Children’s Clothing > Shirts is fine for navigation, but how does this look in a hierarchical taxonomy? Is the concept “shirts” a single polyhierarchical concept or is it three separate concepts in three contexts? The problem compounds when similar hierarchies fork. For example, Men’s Clothing > Shirts > T-Shirts may have the same path for women, but Women’s Clothing > Shirts > Blouses may not have the same path for men. The more variable attributes we attempt to build into the pre-coordinated paths, the more complex the vocabularies become.
Using post-coordination and leaning heavily into relationships not only makes the problem easier to model, it also allows us to adhere more closely to the standards for vocabulary management. In the above example, we may have separate, hierarchical vocabularies for Audiences (Men, Women, Children) and Products (Jeans, Shirts, Dresses). Post-coordination is achieved through specific relationships like Audience hasProduct Product and Product hasAudience Audience. We can also choose to create separate vocabularies for attributes and determine whether they should be in one vocabulary or in separate vocabularies modeled with independent and different properties. For example, Colors may merit its own vocabulary if the property for Color Code only applies to those concepts. Rather than tracking which property goes with each concept or hierarchical branch in a vocabulary, it is often easier to separate concepts into separate vocabularies based on their attributes.
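A minimal sketch of this post-coordinated approach keeps Audiences and Products as separate vocabularies and joins them only through a named relationship. The vocabulary members come from the example above; which product goes with which audience is invented for illustration, and the function names are hypothetical.

```python
# Two independent vocabularies; "Shirts" exists exactly once.
AUDIENCES = {"Men", "Women", "Children"}
PRODUCTS = {"Jeans", "Shirts", "Dresses"}

# Product hasAudience Audience, stored as (product, audience) pairs.
# The specific pairings are assumptions made for this sketch.
HAS_AUDIENCE = {
    ("Shirts", "Men"),
    ("Shirts", "Women"),
    ("Shirts", "Children"),
    ("Jeans", "Men"),
    ("Jeans", "Women"),
    ("Dresses", "Women"),
}


def audiences_for(product):
    """Follow Product hasAudience Audience."""
    return sorted(a for p, a in HAS_AUDIENCE if p == product)


def products_for(audience):
    """Follow the inverse direction, Audience hasProduct Product."""
    return sorted(p for p, a in HAS_AUDIENCE if a == audience)


# Every pair must draw its members from the controlled vocabularies.
assert all(p in PRODUCTS and a in AUDIENCES for p, a in HAS_AUDIENCE)

print(audiences_for("Shirts"))  # ['Children', 'Men', 'Women']
print(products_for("Men"))      # ['Jeans', 'Shirts']
```

Because "Shirts" is a single concept linked to three audiences, the question of whether it is one polyhierarchical concept or three separate ones does not arise in the vocabulary itself; navigation paths can be generated from the relationship when needed.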
One result of post-coordinating concepts is the potential for a proliferation of schemes and specifically defined relationships. While this presents some usability issues such as long lists of vocabularies, the benefits of reliance on relationships to convey information often outweigh the drawbacks.
Specifically named uni- or bi-directional relationships convey their own information whereas more general associative relationships like has related do not. Saying that “jeans” is related to “blue” is far less specific than saying that “jeans” has color “blue”. Labeling the relationship so that it is semantically understandable to the end user is just as important as the relationship being more specific and differentiated from other relationships.
Many discretely defined relationships bring clarity to the relationships between objects and allow for clearly stated connections that lend themselves to easily intelligible knowledge graphs. The key to this approach is relationship management. Using publicly available standard relationships such as has related (or related to) has its merit, but becomes much more powerful when complemented with custom relationships. Managing this combination of relationships requires active ontology modeling and a thoughtful approach to defining and using relationships.
Consider the use of these relationships and which should be bi-directional, inverse relationships and which should be uni-directional. Something like has film is more general and can apply to actors, directors, character appearances, cameos, and so on and perhaps should be a single direction relationship to different entities. This is especially important if this more general relationship is going to be used across entities with different characteristics. Consider the following different entities using the same relationship:
Actor has film Film
Theme Song has film Film
Character has film Film
Director has film Film
Producer has film Film
While some of the entities are more generally “People”, others are “Things” and have unique properties and characteristics. It may be useful to use the same relationship in many ways without a distinct inverse relationship. This could also be modeled more specifically using sets of corresponding uni-directional relationships, such as:
Film has actor Actor
Film has theme song Theme Song
Film has character Character
Film has director Director
Film has producer Producer
The bi-directional relationship of has actor and is actor in, on the other hand, is much more specific and may not cause any conflicts later in the modeling. In other words, the chances are better that the relationship will only apply to an actor name and film and television titles in which the actor has appeared.
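The difference between a bi-directional pair and a uni-directional relationship can be sketched by declaring inverses explicitly and deriving the reverse statements from them. This assumes a simple in-memory model; INVERSE_OF, with_inverses, and the placeholder names "Film A", "Actor A", and "Theme Song A" are hypothetical.

```python
# Only relationships with a declared inverse are treated as bi-directional.
INVERSE_OF = {"has actor": "is actor in"}
# Register the reverse mapping too, so either direction can be asserted.
INVERSE_OF.update({inverse: name for name, inverse in INVERSE_OF.items()})

# Asserted statements as (subject, relationship, object).
asserted = [
    ("Film A", "has actor", "Actor A"),        # bi-directional: inverse is declared
    ("Theme Song A", "has film", "Film A"),    # uni-directional: no inverse declared
]


def with_inverses(triples):
    """Return the asserted triples plus one derived triple per declared inverse."""
    result = set(triples)
    for subject, relationship, obj in triples:
        inverse = INVERSE_OF.get(relationship)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


for triple in sorted(with_inverses(asserted)):
    print(triple)
```

Running this adds a single derived statement, Actor A is actor in Film A. The general has film relationship stays one-directional, so nothing is inferred from Film A back to Theme Song A, which is the behavior the text suggests for relationships shared across very different entities.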
When modeling complex environments, consider making your relationships work harder to create a more semantically understandable and flexible knowledge domain while balancing this power with good relationship management.