SELECT * FROM Celko – June 2010

I just finished reading Only a Theory by Kenneth R. Miller (ISBN: 978-0-670-01883-3). If you don’t know Dr. Miller from NPR and some television appearances, he is a professor of biology at Brown University who was an expert witness at the Dover, Pennsylvania, textbook trial in 2005.

If you don’t remember the case (Kitzmiller v. Dover Area School District), Creationists attempted to subvert the school textbooks. The Dover school board ordered science teachers to read a statement to high school biology students suggesting that there is an alternative to Darwin’s theory of evolution called intelligent design – the idea that life is too complex to have evolved naturally and therefore must have been designed by an intelligent agent.

You can see the NOVA television show online or buy a DVD.

Intelligent Design depends on the concept of “Irreducible Complexity,” which says that some things are so complex that they could not have evolved and therefore have to have been created all at once. Dr. Michael Behe from the Discovery Institute argued in 1996 that the human immune system was an example of such a system. But when presented with 58 peer-reviewed papers, nine books and several immunology textbooks showing the evolutionary path, his response was that all of this was “not good enough,” and he never said what would satisfy him.

Have you ever worked with people like that?

But aside from a look at politics, theocracy, modern biology and what science is, the book gives a good overview of what evolution is. William Dembski is another Creationist associated with the Discovery Institute, but he made assertions that are testable. In particular, he coined “The Law of Conservation of Information,” which says that random processes and natural laws could not produce Complex Specified Information (CSI), like a phone book.

For us computer types, there are computer simulations such as EV, written by Thomas Schneider at the National Institutes of Health. His program takes a random mix of the DNA bases (A, T, G, C) as the first generation. Then each generation is formed by retaining the sequences that have good binding properties and throwing out about half that do not. Then that generation gets a few mutations and the program repeats the cycle.

As the program runs through thousands of generations, the amount of information in the system – in the formal, mathematical, information-theory sense of the word – increases over time. If you remove the selection process, the information level will drop back down to near zero. There are other programs like this and they all disprove Dembski’s Law of Conservation of Information.
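For readers who want to poke at the idea, here is a toy sketch in Python. It is not Schneider’s program, and every name and parameter in it is an assumption made for illustration. A hypothetical target string stands in for “good binding properties,” and how strongly the population agrees at each position (two bits minus the Shannon entropy of that column) stands in for the information measure. It runs the select-and-mutate cycle for a while, then keeps mutating with the selection step switched off.

```python
import math
import random

BASES = "ATGC"

def information_bits(population):
    """Mean per-position information: 2 bits minus the column's Shannon entropy."""
    length = len(population[0])
    total = 0.0
    for i in range(length):
        column = [seq[i] for seq in population]
        entropy = 0.0
        for base in BASES:
            p = column.count(base) / len(column)
            if p > 0:
                entropy -= p * math.log2(p)
        total += 2.0 - entropy
    return total / length

def mutate(seq, rate, rng):
    """With probability `rate`, replace a base with a random one (possibly the same)."""
    return "".join(rng.choice(BASES) if rng.random() < rate else b for b in seq)

def next_generation(pop, target, rate, rng, select):
    if select:
        # Keep the better half (assumed fitness: matches to `target`) and copy it twice.
        ranked = sorted(pop, key=lambda s: sum(a == b for a, b in zip(s, target)),
                        reverse=True)
        parents = ranked[: len(pop) // 2] * 2
    else:
        # No selection: every member reproduces once, whatever its fitness.
        parents = pop
    return [mutate(s, rate, rng) for s in parents]

def simulate(seed=1, pop_size=64, length=16, rate=0.02, selected=300, relaxed=500):
    """Return information at the start, after selection, and after selection is removed."""
    rng = random.Random(seed)

    def random_sequence():
        return "".join(rng.choice(BASES) for _ in range(length))

    target = random_sequence()
    pop = [random_sequence() for _ in range(pop_size)]
    start = information_bits(pop)
    for _ in range(selected):
        pop = next_generation(pop, target, rate, rng, select=True)
    with_selection = information_bits(pop)
    for _ in range(relaxed):
        pop = next_generation(pop, target, rate, rng, select=False)
    return start, with_selection, information_bits(pop)

if __name__ == "__main__":
    print(simulate())
```

In this sketch the middle number should come out well above the other two: information climbs while the filter is running and decays toward zero once mutation is left to work alone. A fixed target string is a much cruder fitness test than binding sites on a genome, so treat it as a way to watch the general mechanism and not as a reproduction of the published results.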

The Evolution of Useful Things: How Everyday Artifacts – From Forks and Pins to Paper Clips and Zippers – Came to be as They Are by Henry Petroski (ISBN: 978-0679740391) shows how the selection of the market can evolve manufactured goods. My favorite example is the pop top on drink cans. The first commercial version to gain any popularity was an aluminum tab that looks like a small tongue wearing an earring. For a period in the 1960s, they littered the landscape of any public meeting area that involved drinking beer as part of the social ritual. People who did not litter had the habit of putting the tab into the can. Extracting aluminum tabs from choking drinkers became a regular event at emergency rooms around the world. Step by step, the pop top evolved into its modern (non-choking) form that we know and love on our beer cans. As the title implies, Petroski traces other “everyday artifacts” and how they got to their current forms.

What all this shows is that evolution is a general mathematical property of systems that have certain characteristics. The system has to have members that reproduce (replication, manufacture), with changes in the next generations (mutation) and a selection process (natural selection, markets) to filter out “failed” members.

Let me consider a database as an environment and my data as members of a population. The gimmick with evolution is that it requires a “large enough population” to work. The definition of “enough” is a bit vague, but we need enough members to let things work out. This is why so many branches in the tree of life end in extinction.

The classic example that most of us have seen is the survey where you sit and rate suggested books. One of your options is that you already own the suggested book. Owning a book and wanting to buy another one like it are not the same thing. I have picked turkeys myself or been given a stinker by a well-meaning friend or gotten a copy because I was late returning a book club reply card. Buying a book and owning it are not the same thing, either. I get gifts for other people but do not ship them directly to them. Fancy analysis tends to try to fit a customer into one category. In marketing, these categories have colorful names like “white picket fences,” “empty nesters,” “shotguns and pick-up trucks” and so forth.

But what if we evolved a customer profile instead of setting up categories? Consider a program to build a “paper doll” from assorted outfits and options to help you define your fashion look. The selection process is simply to ask which images were most pleasing to the user and retain them with some mutants for the next generations. There is no attempt at analysis about the user’s style preferences by established category. If the user likes bright solid colored mini-skirts, then they will be favored over dull patterns on pants by natural selection. If the user has two niches, they will both survive.
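Here is one way the paper doll might be sketched in Python. Everything in it is hypothetical: the option lists, the function names and, above all, the simulated user, who stands in for a real person clicking on the images he or she likes. The evolution loop itself knows nothing about categories or tastes. It only keeps whatever was picked and fills the rest of the next generation with mutants of those picks.

```python
import random

# Hypothetical wardrobe options; a real application would have far more.
OPTIONS = {
    "garment": ["mini-skirt", "pants", "dress", "shorts"],
    "color": ["bright", "dull", "pastel"],
    "pattern": ["solid", "striped", "plaid"],
}

def random_outfit(rng):
    return {key: rng.choice(values) for key, values in OPTIONS.items()}

def mutate(outfit, rng):
    """Copy an outfit and re-roll one randomly chosen attribute."""
    child = dict(outfit)
    key = rng.choice(list(OPTIONS))
    child[key] = rng.choice(OPTIONS[key])
    return child

def evolve_profile(pick_favorites, generations=30, pop_size=24, seed=7):
    """Evolve outfits; `pick_favorites` is the only place preference enters."""
    rng = random.Random(seed)
    pop = [random_outfit(rng) for _ in range(pop_size)]
    for _ in range(generations):
        favorites = pick_favorites(pop)
        if not favorites:
            # Nothing pleased the user: show a fresh random batch.
            pop = [random_outfit(rng) for _ in range(pop_size)]
            continue
        # Favorites survive unchanged; mutants of favorites fill the remaining slots.
        pop = list(favorites)
        while len(pop) < pop_size:
            pop.append(mutate(rng.choice(favorites), rng))
    return pop

def make_simulated_user(tastes, picks_per_taste=4):
    """Stand-in for a human: picks the outfits closest to each private taste."""
    def closeness(outfit, taste):
        return sum(outfit[key] == value for key, value in taste.items())

    def pick_favorites(pop):
        chosen = []
        for taste in tastes:
            ranked = sorted(pop, key=lambda o: closeness(o, taste), reverse=True)
            chosen.extend(ranked[:picks_per_taste])
        return chosen

    return pick_favorites

if __name__ == "__main__":
    skirts = {"garment": "mini-skirt", "color": "bright", "pattern": "solid"}
    dresses = {"garment": "dress", "color": "pastel", "pattern": "striped"}
    for outfit in evolve_profile(make_simulated_user([skirts, dresses]))[:8]:
        print(outfit)
```

With two tastes given to the simulated user, both kinds of outfit should still be in the population at the end, which is the two-niche case described above. The surviving population is the customer profile. Nobody had to decide in advance which marketing category the customer belongs to.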

Play with the idea and let me know what you think.



About Joe Celko

Joe is an Independent SQL and RDBMS Expert. He joined the ANSI X3H2 Database Standards Committee in 1987 and helped write the ANSI/ISO SQL-89 and SQL-92 standards. He is one of the top SQL experts in the world, writing over 700 articles primarily on SQL and database topics in the computer trade and academic press. The author of six books on databases and SQL, Joe also contributes his time as a speaker and instructor at universities, trade conferences and local user groups. Joe is now an independent contractor based in the Austin, TX area.