Eye on TDWI – August 2004

Gary Stratsos’ mellow new CD, “Quiet Fire”, had just finished playing on my Walkman as the plane touched down in San Diego. I was rested, focused, and ready to take on another TDWI conference.
The week promised a chance to catch up on some subjects that I had bypassed in previous conferences, attend some new course offerings, see old friends, and find time to sneak in a sail on San Diego
Bay. You might say that the week held promise of something old, something new, something borrowed, and something blew.


Robert S. Seiner has presented his class “Building and Implementing Data Stewardship and Governance Programs” at past conferences and I had heard good things about it, so I signed up. Seiner has
developed a successful consulting practice around promoting these sorts of programs, and he had some great suggestions on how to create them in our own companies. He stresses a practical approach
where the program is formalized around those who are most likely to be involved with stewardship issues without getting credit for it. He also pointed out that data stewardship, which he defines as
the “formal accountability for the management of data resources,” is currently gaining importance with the advent of the Sarbanes-Oxley Act which holds execs accountable for the quality of
financial data their organizations publish. Seiner showed us we could develop a stewardship program based on minimal cost to the organization. He pointed out that we could sell the concept based on
recognition of work already being done. However, he did stress that we need to write such a program into policy and provide for the program structure, scope, metrics, roles & responsibilities,
conflict resolution, meta models, and change management.

The Data Warehousing Institute doesn’t waste any time getting down to business. Even though it was Sunday, we had several night school options. I dropped in on Richard Hackathorn’s session,
“Information Visualization for BI and DW: What Works and Why?”

Hackathorn took us on a brief tour of visualization applications, provided an overview of the concepts, demonstrated a few products, and gave us several references. The night school format leaves
just enough time to sample a subject, and Hackathorn plans on expanding this class to a half day session for future TDWI conferences. He intends to include work done by a number of thought leaders
in the field of visualization. It should be a good class.


Monday morning the convention kicked into high gear with a keynote by Jaclyn Easton called “Business Intelligence Goes Mobile.” Easton is the author of Going Wireless: Transform Your Business
with Mobile Technology. I didn’t see a strong connection between mobile communications and BI, and I wasn’t quite sure what to expect. Years ago I had seen a software salesman receive a report
on his Blackberry, but Easton had something different in mind. She quickly made the point that the technologies that facilitated M-Commerce (Mobile Commerce) were going to generate a whole lot of
data. She gave us an example with RFID, which is based on tiny radio frequency ID tags that can be sewn into clothing or otherwise embedded in different items. Easton told us how these tags can be
used to track and monitor an assortment of things that are only limited by your imagination. Regardless of what you think of the ethical challenges that this stuff might present, it does look like
we will have new BI data sources to work with in the near future that dwarf the volumes we work with today.

After the keynote, I headed for Tony Rathburn’s course titled simply, “Predictive Analytics.” Rathburn is a low key, straight ahead speaker with a strong set of messages for those involved in
data mining. He spent the first part of the morning reviewing the subject of data mining for both managers and practitioners. He went over some basic definitions and a set of potential benefits,
such as identifying risk, reasons for customer attrition, cross-selling opportunities, and lifetime customer values. He also gave us a list of deadly sins for data mining: hype, lack of goals, no
data, or data without information content. He strongly admonished us that effective data mining is not something that can be achieved with the purchase of an off-the-shelf piece of
software. He cautioned that the way professionals prepare their data for training and validating their models is critical to their success, and then he got down and dirty with the subject matter.
As Rathburn went cruising around different techniques such as non-linear statistics, pattern recognition, clustering, and neural nets, the audience seemed to hang with him. I suspect that the class
was largely made up of statisticians. Actually, that’s how I got into the data game myself, and I got a lot out of Rathburn’s presentation.

Evan Levy’s night school class, “Beyond the Data Warehouse: Architectural Options for Data Integration,” sounded interesting so I dropped in. Levy is a great speaker and took us adroitly through
the issues of data integration surrounding initiatives such as data warehousing, ERP (enterprise resource planning), EAI (enterprise application integration), and EII (enterprise information
integration) software. He and the audience really warmed up when he got to this last subject. You might say that EII is made up of data access middleware, a robust and accessible metadata layer,
and vision (both on the part of the vendor and the customer). The concept of EII has been around for over ten years with vendors such as IBM and Information Builders pushing it hard then (and now).
However, Levy’s presentation helped us create a vendor neutral vision of what this class of technologies can do for us. I look forward to seeing Levy explore this subject in more depth in a
daytime session at a future TDWI conference.

That evening, Cognos hosted a hospitality suite with a rather nice buffet, so I stopped by for a bite. The Cognos folks also did a demo and had one of their customers, Louis Barton of Frost Bank,
give a talk on his experiences. I was impressed with how well Frost had integrated Cognos’s three leading products, ReportNet, PowerPlay, and Metrics Manager. They made it very easy to move from a
specific report to being able to do analysis on the underlying data, and then view key performance indicators to see how the target of your analysis might fit with your overall organizational
performance. This integration works well in overcoming the limits of each class of technology and really does help the knowledge worker to focus on the data at hand.

After dining with Cognos, I moved on to Firstlogic’s suite for an after-dinner drink and a look at their suite of data quality tools, IQ8. I was particularly impressed by IQ8’s data profiling
capabilities, but perhaps that’s because profiling is a step that is often painfully overlooked in building a data warehouse. To profile or not is a classic pay me now or pay me later situation,
and I seem to keep running into pay me later problems. Firstlogic was also hosting the sax and keyboards duo of Bill Shreeve and John Giulino. They played some great jazz standards with a tasty
style. It made for a mellow night.


One thing that students at TDWI conferences get an opportunity to do is compare different approaches and methodologies that have been developed by various camps. You don’t have to be around this
business long before you recognize that there are methodologies that reflect either a “hub and spoke” centralized warehouse model or a combination of integrated data marts. The leading proponent
of the integrated data mart approach is Ralph Kimball. He and co-authors Margy Ross, Laura Reeves, and Warren Thornthwaite wrote the defining text on this subject, The Data Warehouse Lifecycle
Toolkit. I was fortunate enough to catch one of the co-authors of this book, Warren Thornthwaite, give an overview of the subject.

Thornthwaite’s “Data Warehouse Lifecycle Overview” covered a lot of ground, including data warehousing strategy, business requirements gathering techniques, dimensional modeling, architecture,
analytical applications, and deployment. Although that is a bunch to fit into a one-day session, it was well balanced and detailed where it needed to be. I was particularly impressed with how
Thornthwaite covered the major points of dimensional modeling in about an hour. He demystified the subject for traditional data architects by pointing out that the primary difference between third
normal form data models and dimensional models is that the dimensions in a star schema are usually de-normalized. He also covered how to handle history in the dimensions and ended that section
with a four step design process. Thornthwaite also covered ETL process development and how to manage system keys under the deployment and maintenance section. This was a great class. Every
professional in data warehousing should be familiar with the concepts that were covered here.
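
To make the point about de-normalized dimensions concrete, here is a minimal sketch in Python. It is not taken from Thornthwaite's course materials; the table names, columns, and sample rows are all hypothetical. It flattens a normalized product, category, and department chain into a single wide product dimension, then rolls up a small fact table by any attribute of that dimension.

```python
# Hypothetical normalized (third normal form style) source tables, keyed by id.
departments = {1: {"department_name": "Grocery"}}
categories = {
    10: {"category_name": "Snacks", "department_id": 1},
    11: {"category_name": "Beverages", "department_id": 1},
}
products = {
    100: {"product_name": "Pretzels", "category_id": 10},
    101: {"product_name": "Cola", "category_id": 11},
}


def build_product_dimension(products, categories, departments):
    """Flatten product -> category -> department into one wide row per product."""
    dimension = []
    for surrogate_key, (product_id, product) in enumerate(sorted(products.items()), start=1):
        category = categories[product["category_id"]]
        department = departments[category["department_id"]]
        dimension.append({
            "product_key": surrogate_key,  # warehouse-assigned surrogate key
            "product_id": product_id,      # key carried over from the source system
            "product_name": product["product_name"],
            "category_name": category["category_name"],
            "department_name": department["department_name"],
        })
    return dimension


def quantity_by(attribute, facts, dimension):
    """Total the fact quantities grouped by one attribute of the dimension."""
    rows_by_key = {row["product_key"]: row for row in dimension}
    totals = {}
    for fact in facts:
        label = rows_by_key[fact["product_key"]][attribute]
        totals[label] = totals.get(label, 0) + fact["quantity"]
    return totals


product_dim = build_product_dimension(products, categories, departments)

# Hypothetical fact rows: each one points at the dimension through its surrogate key.
sales_facts = [
    {"product_key": 1, "quantity": 3},
    {"product_key": 2, "quantity": 5},
    {"product_key": 1, "quantity": 2},
]

by_category = quantity_by("category_name", sales_facts, product_dim)
by_department = quantity_by("department_name", sales_facts, product_dim)
```

Because the category and department names are repeated on every product row, a query needs only one lookup from the fact table to the dimension. The normalized form would need a chain of joins to answer the same question.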

TDWI schedules an extended lunch break on Tuesdays and Wednesdays in order to give conference goers a chance to check out the latest vendor offerings. I went down to the exhibit hall to see what
was new. Metapa had an interesting data management product, CDB. Their technology uses Linux clusters running open source databases to provide support for high performance parallel queries across
multiple database machines. Metapa supports ODBC and JDBC clients, which covers most BI tools on the market. Microsoft was showing off a new add-in for Excel which allows users to query Analysis
Services. This add-in was a straightforward tool that Excel users will appreciate.


By Wednesday I was ready for a break, so some friends and I checked out a sailboat and went out on San Diego Bay for the day. The sun had come out and burned off a layer of marine air. The wind
picked up, and the day turned out to be perfect for a great sail. There were lots of things to see on the bay. Three huge aircraft carriers were moored close by as were three old square riggers. We
passed by the start of a race between some pretty high-end yachts, and as we approached the mouth of the bay, seals seemed to be everywhere. We, of course, talked business all the time.

After we got back in, I caught Sid Adelman’s night course, “The Knowledge Worker: The Missing Link in the DW’s Success.” In the early days of data warehousing, who used the data warehouse and
how they used it did not get a lot of attention from most data warehousing teams. Those groups often tried to limit the organization to one tool for data access. They did so in order to minimize
support costs. However, as Adelman pointed out that evening, we really should be paying closer attention to the different types of data warehouse users and their different needs. These knowledge
workers range from non-technical business users to report developers (usually not in IT). The implication for us is that we should be thinking about a number of different delivery technologies
depending upon the business need. These include standard reporting, push technology (alerts), visualization tools, OLAP, data mining, and dashboards. Adelman left us with three key thoughts: 1) Be
sensitive to the business, the different needs of knowledge workers, and changes in the environment; 2) Make your boss look good; and 3) Make sure your activities are recognized.

Later that evening I wandered down to Seaport Village for a quick bite and turned in early (one of the rare times).


TDWI schedules the second keynote of the week on Thursday mornings. I look forward to it because it usually energizes me for the final two days of the conference. This time I was especially looking
forward to the keynote because I had heard Howard Spielman present before, and he puts on a great show. His keynote this morning was titled “Visualizing the BI of the Future: Building on the BI of
the Past”. Spielman is an expert on data visualization and he used his knowledge to craft a fascinating presentation. He covered the history of visualization from the time of King George III to
the real-time data presentation support in the 777 cockpit. He finished up by making the point that the pilots of industry are also demanding real-time access to data and that we should embrace the
paradigm shift to real-time data visualization through enterprise graphicacy standards and well designed, enterprise BI cockpits.

The field of data warehousing continues to evolve. The Data Warehousing Institute has added the statement, “The Premier Association for Business Intelligence & Data Warehousing
Professionals.” An increasing number of sessions at TDWI conferences are addressing the issues of data delivery and the importance of business alignment. The instructors are more inclined to talk
about business intelligence as a business oriented wrapper for the data warehouse. One of the more comprehensive courses in this vein that I have come across was “The BI Pathway Method,” which
was delivered by James Thomann and Nancy Williams.

The BI Pathway Methodology is based on identifying business value opportunities up front. This goes beyond finding a seed project with vague business value upon which to justify a data warehouse.
The BI Pathway Methodology suggests that data warehousing teams take a much more business-oriented approach in aggressively driving out business payoffs, build to those targets, and then monitor
not just the data warehouse, but the success of the business. It includes the idea of wrapping the data warehouse architecture in a business architecture with clearly defined deliverables. Thomann and
Williams did walk us through a proven multi-tier data warehouse architecture that included staging, distribution, and delivery layers. However, the new value-add for this course was clearly the
techniques that they proposed to ensure the success of a BI initiative.

That night I headed for Gas Town for dinner with Michael Gonzales and his “hands-on” crew. They had just delivered an update to their advanced analytics lab. I had sat in on this lab in Boston
three months ago. However, given the changes in analytical software and Michael’s aggressiveness in keeping his course updated, his San Diego lab sounded like a new offering. I was sorry that I
missed it this time. He said that the class did a lot of work with spatial analysis. Fortunately, TDWI has this lab scheduled for its upcoming Orlando conference.


In the position of clean-up hitter, Lisa Loftis and Joyce Norris-Montanari appropriately delivered a session on “Data Profiling: Understanding Before Transforming.” Loftis and Norris-Montanari
make a dynamic duo, which I appreciated on the last day of the conference. They defined data profiling as the diagnostic step in ensuring data quality. Data profiling is the process of
“discovering the characteristics of the data in an organization – is it valuable and is it usable.” They pointed out that profiling addresses questions such as uniqueness, the distribution of
data values, and structure. After giving us an excellent overview of the issues and how profiling fits with other data warehousing processes, Loftis and Norris-Montanari gave us a brief run-down on
what to look for in profiling technology.
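
For readers who have not seen profiling in practice, here is a minimal sketch of the three questions mentioned above (uniqueness, distribution of data values, and structure) applied to a single column. It is only an illustration in Python. It is not the presenters' method and does not represent any vendor's tool; the function names and the sample postal codes are hypothetical.

```python
from collections import Counter
import re


def pattern_of(value):
    """Reduce a value to a structural pattern: digits become 9, letters become A."""
    text = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", text)


def profile_column(values):
    """Summarize uniqueness, value distribution, and structure for one column."""
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "is_unique": len(counts) == len(non_null),  # could this column act as a key?
        "top_values": counts.most_common(3),        # distribution of data values
        "patterns": Counter(pattern_of(v) for v in non_null),  # structure
    }


# Hypothetical postal code column with a duplicate, a short value,
# a missing value, and a value in a different format.
zip_codes = ["92101", "92101", "98105", "9810", None, "K1A 0B1"]
report = profile_column(zip_codes)
```

In this sample the pattern counts would flag the four-digit value and the alphanumeric code as candidates for a closer look before any transformation rules are written.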

The week went quickly and before I knew it, I was on the plane back home and planning for Orlando in November. Until next time…



About Tim Feetham

Tim is an independent consultant who specializes in data warehousing for small to medium sized businesses. He has worked in sectors ranging from travel, health care, finance and software, to higher education. He helped design the Data Resource Management Certificate at the University of Washington and has taught in that program for more than 10 years. Feetham is also a former senior research analyst for TDWI. He continues to contribute to TDWI publications and events.