Things in Threes
There is an expression that good things come in threes. There is also an expression that bad things come in threes. In Semantic Web standards, specifically RDF, the atomic data entity is a tripartite structure called a semantic triple. The triple consists of three entities in a subject-predicate-object expression. All things in RDF come in threes: good, bad, or neutral.
What is the advantage of using a triple as the semantic base unit in graph databases? The answer lies in language and basic semantic units.
In writing (and other disciplines, such as photography), there is the concept of the “rule of three,” or omne trium perfectum, in which things grouped in threes are more effective or memorable than items grouped in other numbers.
A good example of this is the Welsh Triads. The Welsh Triads are groups of related medieval texts conveying a mixture of history, folklore, and mythology structured as a preface explaining the relationship between the three items and then a listing and explanation of the items themselves. These texts were initially composed in triples as a mnemonic device for Welsh bards to remember the content they recited orally. What started as a device became a convention.
What is of interest is the use of three to draw similarity and semantically link the items. Haikus presented in English also use three-line stanzas, and there is the rule of thirds in photography, just to point out two other examples. There seems to be something elemental in using a tripartite structure to convey meaning.
In linguistics, the study of word order focuses on how the constituent order of a clause is structured, especially the order of the subject, object, and verb. In English, the basic word order is subject, verb, object (SVO). Since English is widely used, and I come to this analysis as a native English speaker, it would be easy to believe SVO is the most common word order. However, this is not the case. Across the world's languages, subject, object, verb (SOV) is generally counted as the most common basic order, with SVO a close second.
Of course, English is not strictly rigid in its word order. When the syntax changes, the semantics can still be deciphered, even by non-native speakers. The flexibility of syntactic word order is also not unique to English.
Since the W3C (World Wide Web Consortium), the developer and publisher of Semantic Web standards like RDF, was founded in the United States, it seems plausible that the preferred subject-predicate-object triple format mirrors the basic SVO structure of English.
The basic elements of the three SVO parts convey a lot of semantic information in a small package. As with the Welsh Triads, the semantic expressivity of RDF triples allows us to connect things to other things (for ontologists, these things are usually concepts) with a meaningful relationship. RDF triples allow knowledge to be represented in a way that is both machine readable and intelligible to human users.
Although it is termed subject-predicate-object in RDF, the SPO structure can have other names, as depicted in the following diagram. Nodes and edges are preferred terms in graph theory, the foundation of triplestores (RDF graph databases) used to store information in triples. We can also refer to subjects and objects more generically as entities connected by a relationship.
Using the SPO structure, it is easy enough to convey facts, which I will present in plain English to get the idea across simply. For example:
Ahren Lehnert livesIn Oakland
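As a minimal sketch, the statement above can be held in a three-field record. The `Triple` type and its field names are assumptions made for illustration and are not part of any RDF library; real RDF would typically identify the subject and predicate with IRIs rather than plain strings.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str  # named "obj" to avoid clashing with Python's built-in "object"


fact = Triple("Ahren Lehnert", "livesIn", "Oakland")

# A triple always unpacks into exactly three parts.
subject, predicate, obj = fact
print(f"{subject} --{predicate}--> {obj}")
```

Keeping the record to exactly three fields mirrors the point of the section: every fact, however simple, is stored in the same shape.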
We can clarify this statement further by adding more triples which could be concept to concept or concept to property and property value. For example:
Ahren Lehnert hasBirthDate 1973-07-09
In which the person represented by the concept “Ahren Lehnert” is connected to the date represented by the concept “1973-07-09” by a relationship hasBirthDate.
Ahren Lehnert has Birth Date: 1973-07-09
In which the person represented by the concept “Ahren Lehnert” is connected to the date represented by the property value “1973-07-09” by a relationship has. And,
Oakland hasState California
In which the city represented by the concept “Oakland” is connected to the state represented by the concept “California” by a relationship hasState.
(Ahren Lehnert hasBirthDate 1973-07-09) livesIn Oakland
Thus, the basic tripartite structure, when expanded by connecting subject to object to subject to object through semantically expressive relationships, can convey rich meaning. While it might take some time, even a sentence by Faulkner could be represented by a graph of triple statements.
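The chaining described here can be sketched in plain Python under a few assumptions: the triples are stored as simple tuples of strings rather than in a triplestore, and `objects` and `follow` are hypothetical helpers written for this illustration, not calls from any RDF library. The sketch walks from a starting subject along a sequence of relationships, so that the object of one triple becomes the subject of the next.

```python
# Example triples stored as (subject, predicate, object) tuples.
triples = {
    ("Ahren Lehnert", "livesIn", "Oakland"),
    ("Ahren Lehnert", "hasBirthDate", "1973-07-09"),
    ("Oakland", "hasState", "California"),
}


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def follow(start, *predicates):
    """Walk from `start` along a chain of predicates (hypothetical helper).

    The objects found at one step become the subjects of the next lookup.
    """
    current = [start]
    for predicate in predicates:
        current = [o for s in current for o in objects(s, predicate)]
    return current


# Ahren Lehnert -> livesIn -> Oakland -> hasState -> California
print(follow("Ahren Lehnert", "livesIn", "hasState"))
```

A real triplestore would answer the same question with a query language such as SPARQL; the loop above only shows why chained triples are enough to express it.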
Ontologies are controlled vocabularies built on Semantic Web standards using flavors of RDF (SKOS or OWL). Thus, they semantically link concepts to classes, concepts to other concepts, concepts to properties, and any thing to another thing using triples. Hence, they are semantically expressive and conceptually not difficult to model…until you get into very complicated domain or intra-domain realms, that is.
One of the fundamental benefits of these RDF-based controlled vocabulary structures is their dual nature: they are both human intelligible and machine readable (and, therefore, portable). Like the subject-verb-object statements they are built from, ontologies convey rich meaning in relatively simple and compact statements, making them extremely useful in modeling complex knowledge environments.
Good things come in threes!