SELECT * FROM Celko – June 2013

I have a new hero: Nassim Nicholas Taleb. I just ran into his books a few years ago and cannot get enough. Among his academic honors, he is currently Distinguished Professor of Risk Engineering at Polytechnic Institute of New York University. That gives you an idea of what he deals with in his books, both technical and popular.

Taleb’s first popular book was Fooled by Randomness (2001, updated 2008; ISBN-13: 978-1400067930), about the underestimation of the role of randomness in life. In particular, it looked at the stock market bubble and how people who made money during that period would attribute their success to their superior judgment, and try to find rules for their success. This is what Michael Shermer called “Patternicity” in a Scientific American column (2008 December). It is also known as “hindsight bias,” which means that the past looks organized, deterministic and predictable. You can filter, trim and drop details to impose whatever pattern you wish on the data.

What is strange is that you can even be a guru by getting just one prediction correct, or without getting any predictions correct, if you can sell your pattern to your readers. For example, Paul Krugman is a Nobel Prize-winning economist who said in 1998, among other things, “By 2005 or so, it will become clear that the Internet’s impact on the economy has been no greater than the fax machine’s.” He has been rewarded with a regular column in the New York Times.

Another one of my favorite “failing upward” gurus is butterfly expert Paul R. Ehrlich, who wrote The Population Bomb (ISBN-13: 978-0345289995, 1968, revised 1980). It predicted that we were doomed from overpopulation, and it was the bible for left-wing greens in its day. Not only were his predictions wrong, but their opposites turned out to be true. He is also known for the famous Simon–Ehrlich wager. Ehrlich bet that the prices for five metals (nickel, copper, chromium, tin and tungsten – chosen by Ehrlich) would increase over a decade (1980 to 1990) because of population pressure, while economist Julian Simon took the opposite stance. Between 1980 and 1990, the world’s population grew by more than 800 million people, the largest increase in one decade in all of history. When it was over, Paul Ehrlich mailed Julian Simon a check for $576.07 to settle the original $1000 wager.

Taleb’s second book, The Black Swan: The Impact of the Highly Improbable (2007, second edition 2010, ISBN-13: 978-0812973815), is about unpredictable events. The title comes from a standard 17th-century thought experiment. It was accepted as a truth that “All swans are white,” and it was used in textbooks, along with “All men are mortal” and other similar stock phrases. So what were the odds of seeing a black swan? While more possible than living forever, it was not thought to be likely. Then in 1697, explorers found Cygnus atratus in Australia!

We think in terms of a normal (Gaussian, Bell Curve) distribution, where there is a “long tail” after three standard deviations. This is a long tradition in financial literature as well as in the sciences. In fact, the French mathematician Louis Bachelier’s models referred to the outliers as “contaminators” in the early 1900s.

Hey, this distribution has been pretty good for physical problems, like the height or weight of humans in a population. But financial systems do not have physical constraints. There is no law of nature that controls a price, especially with robot trading.

As it works out, if you look at the data and fit the curve to the data and not the other way around, you see that we have more power laws in the real-world data. Growth tends to be modeled by f(t) = Ae^{rt}, where A is a constant, e is the base of the natural logarithm, r is a rate and t is time. The Zipfian or Pareto distributions are another power law (the most frequent values are much more frequent than the rare values).
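
To make the growth formula and the "most frequent values dominate" idea concrete, here is a minimal Python sketch. The function names, the default rate of 5% and the Zipf exponent s = 1 are illustrative assumptions of mine, not values taken from Taleb or from any data set.

```python
import math


def growth(t: float, a: float = 1.0, r: float = 0.05) -> float:
    """Evaluate f(t) = A * e^(r*t); a and r are assumed, illustrative values."""
    return a * math.exp(r * t)


def zipf_shares(n: int, s: float = 1.0) -> list[float]:
    """Share of the total held by ranks 1..n when weight is 1 / rank**s.

    The exponent s = 1 is an assumption chosen for illustration.
    """
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Doubling time for continuous growth at rate r is ln(2) / r.
    doubling_time = math.log(2.0) / 0.05
    print(f"f(0) = {growth(0.0):.3f}")
    print(f"f({doubling_time:.2f}) = {growth(doubling_time):.3f}")

    # With ten ranked items, the top item takes a disproportionate share.
    for rank, share in enumerate(zipf_shares(10), start=1):
        print(f"rank {rank:>2}: {share:.1%}")
```

With these assumed parameters, the top-ranked item gets twice the share of the second and ten times the share of the tenth, which is the lopsided shape the Zipfian description refers to.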

In a Gaussian distribution, an event at 100 standard deviations from the mean has a probability of ~1 in 10^350, which is pretty much impossible. But with a power law distribution, the odds shorten to ~1 in 10^8, which is still a long shot, but nowhere near as remote as under the Gaussian distribution.
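
The gap between the two kinds of tail is easy to see numerically. The following Python sketch compares the upper-tail probability of a standard normal variable with that of a Pareto distribution. The tail exponent alpha = 3 and minimum x_min = 1 are assumptions I picked for illustration, and the thresholds are plain multiples of the scale rather than matched standard deviations, so this shows the shape of the effect and does not try to reproduce the figures quoted above.

```python
import math


def gaussian_tail(k: float) -> float:
    """P(Z > k) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def pareto_tail(x: float, alpha: float = 3.0, x_min: float = 1.0) -> float:
    """P(X > x) for a Pareto distribution: (x_min / x) ** alpha when x >= x_min.

    alpha and x_min are assumed, illustrative parameters.
    """
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha


if __name__ == "__main__":
    print(f"{'k':>3}  {'gaussian':>12}  {'pareto':>12}")
    for k in (3, 5, 10, 20):
        print(f"{k:>3}  {gaussian_tail(k):>12.3e}  {pareto_tail(k):>12.3e}")
```

The Gaussian column collapses toward zero far faster than the Pareto column as k grows, which is why a model built on the bell curve treats as impossible the events that a power law merely treats as rare.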

Taleb’s current book is Antifragile: Things That Gain from Disorder (2012, ISBN-13: 978-1400067824). You know what “fragile” means. A fragile thing cannot handle shocks and is destroyed by them. Think of delicate glassware.

Resilience and robustness apply to things that keep going after shocks. Resilient things resist shocks and stay the same. This property is found in physical things as well as software. A resilient thing repairs the damage from the shock or shakes it off. The goal is to return to the original state.

Robust is a little different; we do not get back to the original state, but to the original goal of the system. My favorite definition of robust software is that it is like a horse versus a dog. Cut off a horse’s leg and you have to shoot it; cut off a dog’s leg, and he keeps running, but perhaps not as fast as before. This analogy is particularly appropriate to me as I look at my cats, three-legged Artemis fighting with eyeless Toby, both still doing the “cat stuff” we need in this house because they are robust.

But there is no word for the exact opposite of fragile. Taleb calls it antifragile. This coined word describes things that benefit from shocks, volatility, randomness, disorder, and stressors. The antifragile gets better. I am going to leave you with that thought, so you can play with it or buy the book.


About Joe Celko

Joe is an Independent SQL and RDBMS Expert. He joined the ANSI X3H2 Database Standards Committee in 1987 and helped write the ANSI/ISO SQL-89 and SQL-92 standards. He is one of the top SQL experts in the world, writing over 700 articles primarily on SQL and database topics in the computer trade and academic press. The author of six books on databases and SQL, Joe also contributes his time as a speaker and instructor at universities, trade conferences and local user groups. Joe is now an independent contractor based in the Austin, TX area.