Data Architecture COMN Sense: Relationships and Semantics


Introduction

Relationships are fundamental to contextual modeling, data modeling, and semantics. It’s important to understand them deeply. This blog will explore the five aspects of a binary relationship type, illustrate fully labeled relationships in a model, and discuss relationships in graph databases and in semantics.

Modeling Relationship Types

Relationships have many aspects that can be named. Figure 1[i] shows a simple model of the employment relationship type (in fact, as we’ll see later, it’s a bit too simple). Employment serves as a good example because it is easy to come up with a name for each of its aspects.

  • name of relationship type: Employment
  • role 1: played by a party of type Employer
  • role 2: played by a person of type Employee
  • reading direction a: Employer employs Employee.
  • reading direction b: Employee is employed by Employer.

Let’s look at these five aspects individually.

Figure 1. The Employment Relationship


In an English sentence, an employment relationship type can be read in either of two directions. The two readings, “employs” and “is employed by,” are shown alongside the employment relationship line. One reading uses fewer words than the other, but from a modeling perspective neither is preferred: they are exactly equivalent.
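To make the five aspects concrete, here is a minimal sketch in Python. The class and field names (RelationshipType, role_1, reading_a, and so on) are my own assumptions for illustration and are not part of COMN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipType:
    """The five nameable aspects of a binary relationship type."""
    name: str       # name of the relationship type
    role_1: str     # role played by the first participant
    role_2: str     # role played by the second participant
    reading_a: str  # verb phrase read from role 1 to role 2
    reading_b: str  # verb phrase read from role 2 to role 1

    def read_a(self) -> str:
        return f"{self.role_1} {self.reading_a} {self.role_2}."

    def read_b(self) -> str:
        return f"{self.role_2} {self.reading_b} {self.role_1}."


employment = RelationshipType(
    name="Employment",
    role_1="Employer",
    role_2="Employee",
    reading_a="employs",
    reading_b="is employed by",
)

print(employment.read_a())  # Employer employs Employee.
print(employment.read_b())  # Employee is employed by Employer.
```

Both readings are produced from the same five values, which is one way to see that neither direction carries more information than the other.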

It’s important not to swap the two roles in an employment relationship, because it doesn’t make sense to say “Employer is employed by Employee.” We say that employment is an asymmetrical relationship type. In contrast, friendship is a symmetrical relationship type. “Sam is a friend of Joe” means exactly the same thing as “Joe is a friend of Sam.”
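The difference between symmetrical and asymmetrical relationship types can be sketched as a lookup over stored facts. This is only an illustration: the holds function, the symmetric flag, and the employer name “Acme Corp” are assumptions I have made up for the example.

```python
def holds(facts, a, b, symmetric):
    """Return True if the fact (a, b) is asserted.

    For a symmetrical relationship type, the stored order of the two
    participants does not matter, so (b, a) counts as well.
    """
    if (a, b) in facts:
        return True
    return symmetric and (b, a) in facts


# Friendship is symmetrical: one stored fact answers both questions.
friendship_facts = {("Sam", "Joe")}

# Employment is asymmetrical: (employer, employee) order is significant.
# "Acme Corp" is a hypothetical employer used only for illustration.
employment_facts = {("Acme Corp", "Sam")}

print(holds(friendship_facts, "Joe", "Sam", symmetric=True))         # True
print(holds(employment_facts, "Acme Corp", "Sam", symmetric=False))  # True
print(holds(employment_facts, "Sam", "Acme Corp", symmetric=False))  # False
```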

Each participant in a relationship plays a role in the relationship, in much the same sense that actors play roles in movies. In general, playing a role in a relationship does not change the nature of the entity playing the role. For example, it is a person who must play an employee role. The person is a person whether or not the person is playing the employee role, has played the employee role, or will play the employee role. This reveals some ambiguity in our English words for roles and entity types. It is not uncommon for a person to introduce himself by saying, “I am an employee of company X.” When we break down this statement, we realize that the person who is claiming to be an employee is really saying that he plays the role of an employee in some employment relationship. The employee is still a person. Therefore, a more accurate model of employment would be as shown in Figure 2. This model shows that Employee Person designates that subset of persons who play the employee role in an employment relationship. Similarly, an Employer Party is that subset of parties (persons or organizations) who play the employer role in an employment relationship.

Figure 2. A More Accurate Model of Employment
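The idea that Employee Person and Employer Party are subsets defined by role-playing can be sketched with ordinary sets. All of the individuals and the organization named here are hypothetical, and representing each relationship as an (employer, employee) pair is my own simplification.

```python
# Hypothetical illustration data.
persons = {"Sam", "Joe", "Ann"}
organizations = {"Acme Corp"}
parties = persons | organizations  # a party is a person or an organization

# Each employment relationship pairs an employer party with an employee person.
employments = {("Acme Corp", "Sam"), ("Ann", "Joe")}

# The role-based subsets are derived from the relationships, not declared.
employer_parties = {employer for employer, _ in employments}
employee_persons = {employee for _, employee in employments}

print(sorted(employer_parties))  # ['Acme Corp', 'Ann']
print(sorted(employee_persons))  # ['Joe', 'Sam']

# Ann is a person whether or not she plays the employee role.
print("Ann" in persons, "Ann" in employee_persons)  # True False
```

Removing a pair from employments changes who is in the derived subsets, but it never changes who is in persons.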


It’s often convenient to name the entire relationship type (in this case, Employment). If data about relationships is to be kept—for instance, the date on which an employment relationship begins—then the record collection for that relationship data can be named after the name of the relationship type. If not, the name of the relationship type might not be very important. See Figure 3 for an example model of data about this employment relationship type.

Figure 3. Employment Data
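As a rough sketch of keeping data about a relationship, the record type below is named after the relationship type and carries a start date. The field names, the sample parties, and the dates are all hypothetical; they are not taken from the figure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Employment:
    """One record per employment relationship (field names are assumed)."""
    employer: str     # identifies the party playing the employer role
    employee: str     # identifies the person playing the employee role
    start_date: date  # data about the relationship itself


# The record collection is named after the relationship type.
employment_records = [
    Employment("Acme Corp", "Sam", date(2015, 3, 2)),
    Employment("Acme Corp", "Joe", date(2016, 6, 1)),
]

# Example query: which employees started in 2016?
started_2016 = [r.employee for r in employment_records
                if r.start_date.year == 2016]
print(started_2016)  # ['Joe']
```

The start date belongs to neither the employer nor the employee alone, which is why it lives on the relationship record.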


Relationship Facts and Graphs

It can be convenient to use one of the reading directions as part of an assertion of a fact about individual parties to a particular relationship; for example, the expression “employs(LexisNexis, Ted Hills)” is a formal way of saying that LexisNexis employs Ted Hills. The expression “isEmployedBy(Ted Hills, LexisNexis)” says exactly the same thing; the form of expression is reversed, that’s all. A model of this fact is shown in Figure 4 below. This model is a graph model where the hexagons are graph nodes and the relationship line is a graph edge.

Figure 4. Model of a Fact; Instance Model; Graph Model
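Here is a minimal sketch of the same fact stored as a single graph edge, with the two reading directions as two functions over that one edge. The function names employs and is_employed_by mirror the expressions above; storing edges as (employer, employee) pairs in a set is an assumption made for brevity.

```python
# One directed edge per fact: (employer node, employee node).
employment_edges = {("LexisNexis", "Ted Hills")}


def employs(employer, employee):
    """Reading direction a: employer employs employee."""
    return (employer, employee) in employment_edges


def is_employed_by(employee, employer):
    """Reading direction b: the same edge, with the arguments reversed."""
    return employs(employer, employee)


print(employs("LexisNexis", "Ted Hills"))         # True
print(is_employed_by("Ted Hills", "LexisNexis"))  # True
print(employs("Ted Hills", "LexisNexis"))         # False
```

Only one edge is stored. The second function adds no new fact; it only reverses the form of expression.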


Relationships and Semantics

The Web Ontology Language (OWL) is widely used to represent semantics (roughly, meaning) on the Web, and is based on a family of formal logics called description logics. The paper A Description Logic Primer (Markus Krötzsch, František Simančík, Ian Horrocks, 2013) states that “roles represent binary relation[ship]s between individuals.” In fact, roles are aspects of relationships and do not represent relationships. Description logic calls an expression such as “employs(LexisNexis, Ted Hills)” a role assertion (OWL’s own term is an object property assertion), but in fact it is a relationship assertion that asserts that two entities play two distinct roles in a relationship. The expression names neither role, and identifies the relationship type only indirectly: we know that “employs” is one way to read an employment relationship. Terminology choices like these are one thing that makes the field of semantics so hard to comprehend.

It is straightforward to see how a graph database can represent relationships between any two things (binary relationships). Since OWL expresses all relationships as binary relationship assertions, there’s a natural affinity between OWL and graph models, leading some graph DBMS vendors to tout graph databases as well suited to semantic processing. However, any relational database can represent relationships between two things, so this can’t be what’s special about graph DBMSs. Graph DBMSs are set apart by their ability to use recursive queries to discover such things as the shortest path between two nodes, the strength of the connections between two nodes, and the degree of similarity between two graphs. It should also be noted that relational databases can represent relationships involving more than two participants, and can represent data about relationships—things that are possible but not so straightforward with graph databases.
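To give a feel for the kind of path question mentioned above, here is a small sketch of a shortest-path search over binary relationship facts, using breadth-first search in plain Python. It is not how any particular graph DBMS works internally; the friendship edges and the shortest_path function are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search for a shortest path, treating edges as undirected."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    queue = deque([[start]])  # each queue entry is a path from start
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorting keeps the result deterministic when paths tie in length.
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None  # no path connects start and goal


# Hypothetical friendship facts, each one a binary relationship.
friendships = [("Sam", "Joe"), ("Joe", "Ann"), ("Ann", "Lee"), ("Sam", "Lee")]

print(shortest_path(friendships, "Sam", "Ann"))  # ['Sam', 'Joe', 'Ann']
print(shortest_path(friendships, "Sam", "Zed"))  # None
```

Each step of the search follows one binary relationship, so the stored facts are no different from rows in a two-column table. What differs is the repeated traversal.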

Next month we’ll look at how relationships appear in structured and unstructured data, and how we can harvest such relationships for meaning.

I’m starting this monthly blog to talk about data architecture and data modeling topics, focusing especially, though not exclusively, on the non-traditional modeling needs of NoSQL databases. The modeling notation I use is the Concept and Object Modeling Notation, or COMN (pronounced “common”), which is fully described in my book, NoSQL and SQL Data Modeling (Technics Publications, 2016). See http://www.tewdur.com/ for more details.


[i] Models in this blog use the Concept and Object Modeling Notation (COMN), more fully described at http://www.tewdur.com/.

Copyright © 2016, Ted Hills


About Ted Hills

Ted Hills has been active in the Information Technology industry since 1975. At LexisNexis, Ted co-leads the work of establishing enterprise data architecture standards and governance processes, working with data models and business and data definitions for both structured and unstructured data.
