SELECT * FROM Celko – March 2013

Dennis McCafferty wrote an article in CIO Insight this month (2013 Feb). It was based on a survey of 300+ IT departments about their Big Data projects. What was surprising to me was the lack of surprises.

In the survey, 85% of respondents say their data footprint is larger than one terabyte. I do not know how many of the respondents in the sample are in the petabyte range, but a terabyte is not that big today. I can buy a drive that size from a catalog for under $200.

A Dilbert cartoon this year has the pointy-haired boss informing Dilbert that “Big Data lives in the clouds and watches over us,” probably because he read an article in a magazine on an airplane. The IT versus business management problem is not new, either. “Often, those charged with implementation are the last consulted,” says Jim Kaskade, CEO of Infochimps, which helps companies deploy Big Data environments in public, private and virtual private clouds. “CIOs need more insight into the too-often overlooked views of those charged with the heavy lifting, and companies need to start with the business problem first to properly scope their projects.”

The other unsurprising point was that 55% of Big Data projects do not get completed. Any of us old timers will remember the early days of the structured programming revolution and the roots of software engineering. We did surveys back then, too. This has been the way things work in this trade for decades!

The classic source for IT failure rates is the “Standish Chaos Reports” from the Standish Group. They define success as projects on budget and with expected functionality. The report series has been updated for decades and has over 40,000 projects in its files. Project failures declined from a 31% failure rate in 1994 to 15% of all projects in 2006.

The CIO Insight survey came back with 80% of enterprises saying that finding the right talent for Big Data projects is hard. The same thing was true for SQL programmers when it was the new toy in the nursery. If you cannot hire this talent, then you have to train your people. In the survey, 73% of the respondents say understanding the platforms required for these projects can be difficult. And yet, 66% of respondents prefer to build Big Data projects in-house!

Google “software project failure rates” and read a few of the articles. I liked the one with the title “62% of IT projects fail” from 2008. Of course, you have to define “failure” in some measurable terms. Sure, we have budget overruns and higher-than-expected maintenance costs that are part of our folklore. We also have projects that deliver something useless or inadequate. Currently about 25% of all IT projects get canceled before completion, so they deliver nothing at all.

This is not necessarily bad. If an off-the-shelf product appears on the market and makes your project redundant, then it might be a wise move to cancel it. You save the rest of the project budget, do not have to hire support staff after deployment, and get a user community.

The business side of the house is easily impressed with demos and slide shows and lacks an understanding of the technology. The sin of the IT side of the house is a love of new toys and feature creep without regard to a business reason. I think the problem is when both sides talk themselves into a project.

A classic example was Greyhound’s disastrous TRIPS computerized reservation system in 1991. The idea was that a computerized reservation system could let them make more efficient use of buses and drivers and that it would improve customer service. Sounds good.

The first problem was that most bus passengers did not reserve seats in advance. They showed up at the terminal, bought a ticket (often for cash), and took the next bus. Few buses even had reserved seats. Back in those days, customers called for schedule information, not for reservations as airline customers did. But the financial people liked TRIPS, and this gave Greyhound the ability to borrow funds and raise capital.

The TRIPS project began with a staff of forty or so and a $6 million budget. You might want to compare that team with the SABRE airline reservation project. SABRE was orders of magnitude larger in both staffing and budget.

Nobody stopped to think that bus reservations are more complex than airline reservations. An airline passenger can cross the United States with only one or two stops, whereas a bus passenger may make ten or more stops on a trip, and a cross-country trip might involve scores of stops. Greyhound technicians estimated that a bus management system would need to manage ten times the number of vehicle stops per day of an airline vehicle management system. On top of that, the average bus passenger back then did not have a credit card for buying tickets by telephone.

Greyhound management publicly promised to launch the system in time for the busy 1993 summer season. Unfortunately, ticket clerks required forty hours of training to learn to use it. The database was not complete, and they had to switch screens to add schedule information from the old books. Clerk time to issue tickets doubled when they used the system, if it did not crash.

In spite of these failures, the flawed version of TRIPS was rolled out in May 1993 so that management could keep its promises. TRIPS would freeze randomly when the system got to fifty terminals. At the toll-free telephone center in Dallas, the system sometimes took as long as 45 seconds to respond to a single keystroke and could take up to five minutes to print a ticket.

TRIPS finally had to be thrown out, but not before almost killing the company. Nobody would back down. Management wanted to show off, and IT knew that they just needed a little more time, a little more budget.

Today, Greyhound’s online ticketing is doing just fine. People have credit or debit cards today. The public is used to online shopping for everything. And we are much better at building databases.

One of the possible reasons cited for the slight decline in the Standish Group’s failure rates was that projects were getting smaller as we moved to smaller platforms. But Big Data projects are bigger by nature, aren’t they? Could that be part of the return to higher failure rates? Just asking.




About Joe Celko

Joe is an Independent SQL and RDBMS Expert. He joined the ANSI X3H2 Database Standards Committee in 1987 and helped write the ANSI/ISO SQL-89 and SQL-92 standards. He is one of the top SQL experts in the world, writing over 700 articles primarily on SQL and database topics in the computer trade and academic press. The author of six books on databases and SQL, Joe also contributes his time as a speaker and instructor at universities, trade conferences and local user groups. Joe is now an independent contractor based in the Austin, TX area.