Language is often ranked as one of the most remarkable creations of human beings. Our ability to symbolize, in turn, opens up a variety of ways to create. Language plays a tremendous role in human affairs. It serves as a means of cooperation and as a weapon of conflict. With it, men can solve problems, erect the towering structures of science and poetry, and talk themselves into insanity and social confusion.
The verbal world may be considered a map of the extensional world. Like any model used to represent something other than itself, language ‘fits’ better in some places than in others, but at no place is there a perfect correspondence between language and the world ‘out there.’ When we talk or write, we tend to act as if we know ‘all’ about a subject, as if we have said all about it. When we use words, we abstract – that is, we select and leave out things.
The action of putting things that are not identical into a group or class is so familiar that we forget how sweeping it is. These are just a few statements from Kenneth Johnson’s outline survey of general semantics. They show us that language is key to communicating and conveying information.
General semantics is a field of knowledge based on the ideas of Alfred Korzybski. It is a practical discipline that applies language strategies and modern scientific thinking to problems in everyday life. Its aim is clearer thinking, better speaking, improved communication, more peaceful interaction, and greater sanity in one’s life. Key ideas within general semantics are that we should be aware of our tendency to abstract, that things are specific to a moment in time, that no two things are identical, and that no word has exactly the same meaning twice.
There is a large body of knowledge related to general semantics, and I won’t even try to suggest that I have given you a proper idea of what it is. The purpose of this blog is to make you aware of the importance of language for information and, at the same time, to warn you of its problems. Language can help the data management professional and is attracting increasing interest under the banners of Big Data and natural language processing. However, these topics are often approached from a technology perspective, with little awareness of the intricacies of language.
My personal interest in language stems from my interest in trying to understand people and the way that they think. I also became aware that a lot of what I do revolves around using language to convey information. I usually spend a lot of time trying to find the right words, not only when writing texts but also when formulating statements such as goals, principles, guidelines, and requirements. It is no coincidence that I also became more interested in conceptual modeling. In particular, a conceptual data model is really a structured way of describing the real world, which is very close to our language. A good conceptual model uses the terms that are part of normal business language. This ensures that it is a good reflection of the universe of discourse. It separates describing reality from decisions about how to represent reality in data or using technology.
Such separation of concerns increases the quality of models by keeping discussions that are irrelevant to the model at hand out of the way. A conceptual data model is very different from a logical data model. In practice, I see a lot of data models that are a mix of both. A common symptom is a conceptual model that uses abstractions that do not exist in everyday language. The result is that these models are hard to understand and to validate with the people in the organization who are involved in the business process. A good conceptual model is really an information model; it provides meaning to its target audience.
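The symptom described above, a conceptual model using abstractions that nobody in the business actually says, can be roughly checked by comparing entity names with a list of business terms. The following is only a minimal sketch under my own assumptions: the function name, the entity names, and the vocabulary are all hypothetical, and a match on names alone says nothing about whether the meaning is shared.

```python
def non_business_terms(entity_names, business_vocabulary):
    """Return the entity names that do not occur in the business vocabulary.

    The comparison ignores case only; synonyms and plurals are not handled.
    """
    vocabulary = {term.casefold() for term in business_vocabulary}
    return [name for name in entity_names if name.casefold() not in vocabulary]


# Hypothetical example data, invented for illustration.
entities = ["Customer", "Order", "Party Role Association"]
vocabulary = ["customer", "order", "invoice"]

suspects = non_business_terms(entities, vocabulary)
print(suspects)
```

Any name this flags is a prompt for a conversation with the people involved in the business process, not proof that the model is wrong.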
Language also propagates to other parts of data models, most importantly the definitions of the concepts. It is very important that these definitions are understandable. As mentioned before, using only business terminology is an important part of that. Definitions should state the essential meaning of the concept and be precise, unambiguous, and concise. They should avoid circularity and should not embed the definitions of other model elements. General semantics provides us with further guidelines for formulating definitions and other texts.
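Of the criteria above, circularity is one that can be checked mechanically to some extent. The sketch below treats a glossary as a graph in which a term points to every other glossary term mentioned in its definition, and reports the terms whose definitions eventually lead back to themselves. The function name and the glossary entries are hypothetical, and the matching is deliberately naive (whole words, case-insensitive, no plurals or synonyms), so treat it as an illustration rather than a finished tool.

```python
import re


def find_circular_definitions(glossary):
    """Return, sorted, the terms whose definitions lead back to themselves."""
    # Map each term to the glossary terms that appear as whole words
    # in its definition (case-insensitive, exact word forms only).
    references = {}
    for term, definition in glossary.items():
        text = definition.lower()
        references[term] = {
            other
            for other in glossary
            if re.search(r"\b" + re.escape(other.lower()) + r"\b", text)
        }

    def reaches_itself(start):
        # Depth-first walk over referenced terms, looking for the start term.
        stack = list(references[start])
        seen = set()
        while stack:
            current = stack.pop()
            if current == start:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(references[current])
        return False

    return sorted(term for term in glossary if reaches_itself(term))


# Hypothetical glossary, invented for illustration.
glossary = {
    "Customer": "A party that has placed at least one order.",
    "Order": "A request by a customer to deliver goods.",
    "Product": "An item offered for sale.",
}

print(find_circular_definitions(glossary))  # ['Customer', 'Order']
```

Here "Customer" is defined in terms of "Order" and "Order" in terms of "Customer", so neither definition can be understood without already knowing the other. Whether a flagged pair is a real problem remains a judgment call for the modeler.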
Let’s consider some of these. We should avoid talking in extremes, as if two things are either completely opposite or completely identical. In reality, there are a lot of nuances; we only see some of the characteristics of objects, not all there is to know about them. Similarly, there is often no single cause for a certain phenomenon. We tend to use terms such as “and”, “plus”, and “also” to suggest that elements can simply be added to each other. We also use language to split things that cannot be separated in reality. Humans are also bad at separating facts (based on observations by multiple people) from inferences (our interpretations of facts). The bottom line is that ultimate definitions do not exist because words do not have “one true meaning”. They mean different things to different people. They mean different things at different times. And they mean different things in different contexts.