SELECT * FROM Celko – December 2012

I spoke at the Data Modeling Zone in Baltimore in mid-November. I did two of my standard presentations (“Scales & Measurements” and “Design of Encoding Schemes”), which were better suited for a data modeling conference than some of my usual SQL programming tricks presentations. My sessions went well. I got to eat fresh oysters and crabs. When you live in the middle of Texas, this is a big deal.

The best presentation was by David Hay (Essential Strategies, Inc.), and it was the last one of the show. I have had that final spot, and I know how hard it can be. The audience is ready to go to dinner, the bar or the airport (a few have checked out of the hotel and have their baggage with them). David did not lose a single person.

He developed a conceptual data model for baseball cards. As he added more detail to the model, he changed it. We started with a Player who is hired by a team for a position. Then he starts to play, and we collect statistics on him in each game. I now realize that I do not have any idea how baseball is really played! Players collect one set of statistics as a batter and another as a pitcher. Then the players start moving from team to team and from position to position, and we have to track them. The team has its own stats. Arrgh!

David develops the model so nicely that you are not worried about the complexity of a sacrifice fly versus a sacrifice bunt, or a looking strike versus swinging strike versus foul strike. You see that the structure of the model can handle everything without requiring a re-write.

I guess this is one reason I stuck to “Mars Attacks” bubble gum trading cards when I was a kid. Once we fought off the Martian invasion, the set was complete and you could get on to “The Outer Limits” card series.

When you get immersed in one thought pattern, you see the whole world through that tool for a few weeks afterward. I was drawing boxes and arrows for everything. When I got back home, I had a pile of “to be read real soon now” books on my desk.

I grabbed The Geography of Thought: How Asians and Westerners Think Differently… and Why by Richard Nisbett (ISBN-13: 978-0-7432-5535-6) and had another immersion experience. Nisbett tries to explain differences in ways of thinking between Westerners and East Asians. His premise is that Greece had a heterogeneous population and settled social things by public debate, while China had a homogeneous population that settled social things by public consensus. Aristotle versus Confucius!

He then presents experiments and explains their results in those terms. If you are old enough, you will remember the “Dick and Jane” elementary school readers. The first page says “See Dick run. See Dick play. See Dick run and play.” We do not get to Spot the dog and Sally the baby sister for a while. The equivalent Chinese primer of the same time period starts with two boys, one giving the other a piggyback ride, with the text “Big brother takes care of little brother. Big brother loves little brother. Little brother loves big brother.” Westerners start from individual actions while the Chinese start with relationships.

We also like to make categories and use hierarchies. Let me give you another quote, from the essay “The Analytical Language of John Wilkins” by Jorge Luis Borges: “These ambiguities, redundancies, and deficiencies recall those attributed by Dr. Franz Kuhn to a certain Chinese encyclopedia entitled Celestial Emporium of Benevolent Knowledge. On those remote pages it is written that animals are divided into (a) those that belong to the Emperor, (b) embalmed ones, (c) those that are trained, (d) suckling pigs, (e) mermaids, (f) fabulous ones, (g) stray dogs, (h) those that are included in this classification, (i) those that tremble as if they were mad, (j) innumerable ones, (k) those drawn with a very fine camel’s hair brush, (l) others, (m) those that have just broken a flower vase, (n) those that resemble flies from a distance.”

This is funny to Westerners because it has no organizing principle for the categories. Think about Aristotle and his genus and species model of definitions. But an East Asian would think in terms of fluid relationships among objects playing roles and not in terms of properties the object innately has by its nature.

In one experiment, children in the U.S., China and Taiwan are given the list {panda, monkey, banana} and asked to pick two words from the list that go together. The American kids preferred {panda, monkey} because they fit into the animal category. The East Asian kids preferred {monkey, banana} because they have a relationship: monkeys eat bananas.

Another experiment consisted of showing American and Japanese kids a pyramid made of cork. They were told that they were looking at a “Dax,” a nonsense word to which they would have to assign meaning. They were then shown two objects: a different geometric shape made of cork and another pyramid made of white plastic. The task was to pick the “Dax” from the pair. Americans tended to pick by shape and grabbed the plastic pyramid. The Japanese kids tended to pick the other cork object.

The Westerners saw discrete, unconnected objects, while the Japanese kids saw a world based on substances (cork) that can flow into other relationships (shape). This mind-set also shows up in language learning. Western infants learn nouns more rapidly than verbs, and the reverse is true for East Asian infants. That is stranger when you consider that Japanese and other East Asian languages have only one grammatical form for a noun while inflectional Western languages can have dozens of forms. But then it is very hard for a Japanese speaker to refer to himself outside of a situation; there is nothing like the universal, non-situational first person pronouns of Western languages.

When I start a data model, I make a list of the entities first. Then I look for a key and fill in the attributes. Then I add column and row constraints. If I need auxiliary tables, such as a calendar, report periods and function look-ups, then I copy them from my library. The last things I add are the relationship tables with the PK-FK constraints and DRI actions.
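As a minimal sketch of that order of work, here is a toy schema loosely borrowed from the baseball card example. The Players, Teams and Rosters tables and all of their columns are hypothetical names made up for illustration; they are not David Hay's model. Python's standard sqlite3 module is used only so the DDL can be run as written, and SQLite needs foreign key enforcement switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Entities first: a key, then attributes, then column constraints.
CREATE TABLE Players
(player_id INTEGER NOT NULL PRIMARY KEY,
 player_name TEXT NOT NULL CHECK (length(player_name) > 0));

CREATE TABLE Teams
(team_id INTEGER NOT NULL PRIMARY KEY,
 team_name TEXT NOT NULL UNIQUE);

-- Relationship table last: PK-FK constraints with DRI actions.
CREATE TABLE Rosters
(player_id INTEGER NOT NULL
   REFERENCES Players (player_id)
   ON UPDATE CASCADE ON DELETE CASCADE,
 team_id INTEGER NOT NULL
   REFERENCES Teams (team_id)
   ON UPDATE CASCADE ON DELETE CASCADE,
 start_date TEXT NOT NULL,   -- ISO-8601 text, so string order is date order
 end_date TEXT,              -- NULL means the player is still on the team
 CHECK (end_date IS NULL OR start_date <= end_date),  -- row constraint
 PRIMARY KEY (player_id, team_id, start_date));
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Players VALUES (1, 'Example Player')")
conn.execute("INSERT INTO Teams VALUES (10, 'Example Team')")
conn.execute("INSERT INTO Rosters VALUES (1, 10, '2012-04-01', NULL)")

# The DRI action removes the roster row when the player is deleted.
conn.execute("DELETE FROM Players WHERE player_id = 1")
roster_count = conn.execute("SELECT COUNT(*) FROM Rosters").fetchone()[0]

# A roster row for a player who does not exist is rejected.
try:
    conn.execute("INSERT INTO Rosters VALUES (99, 10, '2012-04-01', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Notice that the relationship only shows up in the third table, after both entities are already fully defined, which is exactly the entities-first habit described above.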

I always thought I had a very “set-oriented” approach to databases, but I never realized that I am a victim of a Western mind-set, too! I wonder if we could design a database with an Asian mind-set?


About Joe Celko

Joe is an Independent SQL and RDBMS Expert. He joined the ANSI X3H2 Database Standards Committee in 1987 and helped write the ANSI/ISO SQL-89 and SQL-92 standards. He is one of the top SQL experts in the world, writing over 700 articles primarily on SQL and database topics in the computer trade and academic press. The author of six books on databases and SQL, Joe also contributes his time as a speaker and instructor at universities, trade conferences and local user groups. Joe is now an independent contractor based in the Austin, TX area.