After a lazy Saturday afternoon drifting between tasting rooms in wine country, I took on The Data Warehousing Institute’s 2003 World Conference full force. Anybody who has been to a TDWI
conference knows that there is a lot of content packed into six days. Given a raft of concurrent sessions, no one person can hope to cover the whole thing, but that’s what makes coming back that
much more interesting. This was my eighth TDWI conference and I still got a lot out of the week.
Sunday
After checking in and getting my badge, I headed off to Jim Schardt’s advanced two-day class on requirements gathering. Schardt’s background is in computer science and aerospace, so I thought that
he might provide some new and interesting perspectives. Although I have been in the business for, um, a long time, I am always open to learning new techniques in this area. Schardt did not
disappoint.
Schardt gave the class some good interviewing techniques and then led us through an engaging exercise centered on defining sound business objectives. He went on to suggest that our conceptual
modeling efforts would be more effective using object-oriented techniques. I had dismissed OO for data warehousing in the past, assuming that narrowly defined use cases would lead to an overly
restrictive data warehouse design. However, Schardt suggested that UML (the Unified Modeling Language) was a more natural and straightforward way of developing conceptual models. He focused on four
general use cases, ‘answer ad hoc questions’, ‘answer questions with a report’, ‘monitor business process’, and ‘discover business patterns’. These use cases reflect a categorization that
should be familiar to most of us involved in data warehousing, and Schardt’s application of these categories as use cases allowed me to get over my reservations about OO. Schardt also led us
through an exercise where we applied reverse engineering techniques to our UML models. He went on to show how we could use our models as sources for our dimensional designs, which turned out to be
a nice lead-in to Laura Reeve’s upper level class on dimensional modeling, which was on my schedule for later in the week.
After Schardt’s class let out, I checked out a night school session on the ‘Secrets of Predictive Analysis’ presented by Robert Cooley. Night school classes are about 90 minutes long and Cooley
wasted no time in getting into his subject. Predictive analysis is really what data mining is all about. He pointed out that business analysts use OLAP tools to answer questions based on models
that include the key determinant factors for given business outcomes, whereas data analysts use data mining techniques to determine what those models should look like in the first place. Cooley
claimed that predictive analytics require small investments and can reap large rewards. He also showed how predictive analytics could extend the reach of metrics management and OLAP activities.
It’s too bad that many shops with data warehousing programs do not leverage the data they manage through active data mining efforts.
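The distinction Cooley drew between OLAP analysis and model discovery can be sketched in a few lines of code. This is only an illustration on entirely made-up data: a one-rule “decision stump” that scans a toy churn data set for the split value that best predicts the outcome, which is the simplest possible instance of mining determining what the model should look like.

```python
# Toy sketch of Cooley's point: OLAP answers questions against a known
# model; mining discovers the model. Here a one-rule "decision stump"
# scans invented data for the support-call count that best predicts churn.

data = [  # (monthly support calls, churned?) -- fabricated for illustration
    (0, False), (1, False), (1, False), (2, False),
    (3, True), (4, True), (5, True), (6, True),
]

def best_threshold(rows):
    """Return the (split value, error count) that misclassifies fewest rows."""
    best = (None, len(rows) + 1)
    for cut in sorted({calls for calls, _ in rows}):
        # Predict churn whenever calls >= cut; count the misses.
        errors = sum((calls >= cut) != churned for calls, churned in rows)
        if errors < best[1]:
            best = (cut, errors)
    return best

cut, errors = best_threshold(data)
print(f"predict churn when calls >= {cut} ({errors} errors)")
```

Real data-mining tools search far richer model spaces, of course, but the principle is the same: the data, not the analyst, picks the determinant factors.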
After a brief stop at the Informatica opening night reception, and twelve hours on the go, I checked into my room. This was going to be an intense week.
Monday
Good conferences have good keynotes and this one was no exception; it had two. Wayne Eckerson reviewed an ETL (extract, transform, and load) study that he had recently completed with Colin White,
and later in the week Jonathan Geiger and Debbie Froelich presented practical suggestions for data warehouse program management.
The Eckerson/White ETL study was quite extensive and arguably the best on the subject in recent years. I was interested to see whether they could update a long-held perception in the
industry that most data warehousing programs are built on custom code. Their survey of over 700 organizations, both large and small, found that only 18% currently rely solely on home-grown ETL. The
rest either depend completely upon ETL tools or deploy a mix of ETL tools and custom code. In addition, ETL tool customers were more satisfied with their strategies. Eckerson pointed out that even
though a new ETL purchase might require up to 12 months for developers to gain proficiency, once done, the organizations that took this route gained a five to six times advantage over those that
stuck with custom code. Eckerson also pointed out additional gains in administration for those who adopted ETL tools.
After the keynote I went off to the second day of Schardt’s class on requirements gathering and then on to a second go-’round of night school.
Getting vendors to adhere to a single meta data standard has long been problematic because no industry-wide standard existed. That changed several years ago, when the two most active standards
groups joined forces under the Object Management Group and produced the Common Warehouse Metamodel, CWM. Monday night Mark Riggle talked about the impact of CWM. CWM promises to be the primary key
in facilitating communications among different data warehousing technologies. Often, when a new standard comes out, users put their vendors through the wringer to support the next great
thing, and the standard goes nowhere. However, Riggle made a convincing case for the staying power of CWM. He also mentioned that CWM is based on a UML model. OO pops up again.
Tuesday
No keynote today, but TDWI put on a spectacular breakfast on the 46th floor of the Hilton. The view of the city was great and my fellow conference goers provided stimulating company. After drinking
in the atmosphere for an hour, I dropped in on TDWI’s Business Intelligence Strategies session.
Unlike regular classes, six different instructors presented over the course of the day. The overall focus of the session was on Web services. I was particularly interested in the new standard for
OLAP over the Web, XMLA. With over 30 vendors signed on to support XMLA, it should start appearing on a number of request-for-information checklists. I was also interested to see how many vendors sent their
folks to regular sessions at TDWI. This particular class was good for software designers who need to stay ahead of the curve, but I also encountered classmates from PeopleSoft, Informatica, and
Microsoft at other sessions during the week.
Tuesday and Wednesday are vendor days at TDWI conferences. Long lunch hours and an extended evening session with serious hors d’oeuvres gave me a chance to catch up on what’s new on the technical
front. Microsoft demonstrated its new Reporting Services; Cognos provided a tour of its interface to IBM’s UDB OLAP features; and I talked with the Embarcadero Product Manager for their DT/Studio,
a surprisingly robust ETL tool for the money.
After three intense days of classes, I usually get bushed, but this time I went on the offensive. I figured that with a brew or two and some good jazz, my body wouldn’t know that it was wiped out.
I had stumbled across Johnny Foley’s Irish House when I was out running an errand. There was a sign in the window:
EVERY TUESDAY NIGHT!
FIL LORENZ & THE COLLECTIVE WEST
JAZZ ORCHESTRA
So I dropped in. What a show! I was ready for the rest of the week.
Wednesday
Another spectacular breakfast on the 46th floor and then on to a town meeting with Claudia Imhoff and Laura Reeves… This session focused on the differences and similarities of the Corporate
Information Factory (CIF) and Data Warehouse Bus Architecture (DWBA). The key difference between these two frameworks was that the CIF encompassed three tiers — staging, the enterprise data
warehouse (EDW) in third normal form, and various stripes of dimensional data marts. The role of the EDW was to maintain an integrated and comprehensive history of all of the data elements in the
warehouse. The primary user access points for the CIF were the data marts. The DWBA, on the other hand, was a two-tier structure with a staging area and a series of integrated dimensional
data marts. The emphasis in the DWBA design was that everything beyond staging should be accessible to your users. Integration and history were essential elements of the bus. TDWI billed this as a
bake-off between top-down and bottom-up architectures, but as both Reeves and Imhoff pointed out, those descriptions are getting a little shopworn. Both stressed the importance
of an incremental approach to data warehouse building. My guess is that TDWI really wanted to name this session ‘Clash of the Titans: Inmon vs. Kimball’, but chickened out in the final
cut. Oh well, it was a fun session.
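Whatever the tiering above it, both camps agree that users ultimately query dimensional structures. As a minimal sketch, with all table and column names invented for illustration, here is the kind of star schema both architectures deliver: one fact table joined to conformed dimensions, using Python’s built-in sqlite3 module as a stand-in warehouse.

```python
# Minimal sketch (all names invented) of the dimensional layer both the
# CIF and the DWBA ultimately present to users: a fact table keyed to
# conformed dimensions. In the CIF this mart would be fed from a 3NF EDW;
# in the DWBA it sits directly on the bus beyond staging.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INT, month INT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (date_key INT, product_key INT, amount REAL);

INSERT INTO dim_date    VALUES (1, 2003, 5), (2, 2003, 6);
INSERT INTO dim_product VALUES (10, 'Toys'), (11, 'Games');
INSERT INTO fact_sales  VALUES (1, 10, 100.0), (2, 10, 150.0), (2, 11, 80.0);
""")

# The classic mart query: facts summed and grouped by dimension attributes.
for row in con.execute("""
    SELECT d.month, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d    ON d.date_key    = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category"""):
    print(row)
```

The argument between the frameworks is really about what feeds this layer, not about the layer itself.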
Wednesday night a number of vendors rolled out hospitality suites. Some of the vendors offered conference goers a chance to talk with their development staffs, others made presentations of their
latest and greatest achievements, and yet others simply provided drinks, hors d’oeuvres, and a chance to network with fellow conference goers.
Thursday
Jonathan Geiger led off the second keynote with a review of program management principles and lessons learned in the field. Data warehousing projects need the framework of a sound program that has
a long-term focus and a high-level mandate to integrate data and consolidate decision support technologies. Debbie Froelich went on to relate how Mattel Toys applied program management principles
to their data warehousing effort. She pointed out that active executive sponsorship as well as support is critical, and that unifying local decision support systems was a challenge. This last part
is tricky. I have seen more than one data warehousing effort run into trouble due to the politics of technology.
After the keynote, I hit Laura Reeves’s upper-division class on techniques for dimensional modeling. Reeves was a coauthor, with Ralph Kimball, Margy Ross, and Warren Thornthwaite, of The Data
Warehouse Lifecycle Toolkit, so I was especially interested in this class. I was looking for some elegant solutions to problems that I had encountered in the past. Reeves did provide some good
solutions to design challenges such as unbalanced hierarchies, but she also offered brute force techniques that we could use when we faced tool limitations. Sooner or later most of us will have to
create numerous work-arounds when we are faced with a physical database structure that was generated straight away from an elegant but complex logical model. For example, some data access tools can
handle snowflake schemas and bridge tables, but others cannot. Reeves was refreshingly pragmatic in her approach to database design and her students benefited from it.
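One of the work-arounds Reeves touched on can be sketched concretely. This is only an illustrative version, with an invented org chart and invented table names: a bridge table that flattens a ragged hierarchy into every ancestor/descendant pair, so a single join can roll facts up to any level even when branches have different depths.

```python
# A sketch of the bridge-table technique for unbalanced (ragged)
# hierarchies: precompute every ancestor/descendant pair, with its depth,
# so one join rolls facts up to any node. All names here are invented.
import sqlite3

# Ragged org chart as (node -> parent); branches have different depths.
org = {"HQ": None, "East": "HQ", "West": "HQ", "Boston": "East"}

def bridge_rows(tree):
    """Yield (ancestor, descendant, depth) for every path, including self."""
    for node in tree:
        depth, anc = 0, node
        while anc is not None:
            yield (anc, node, depth)
            anc, depth = tree[anc], depth + 1

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE org_bridge (ancestor TEXT, descendant TEXT, depth INT)")
con.executemany("INSERT INTO org_bridge VALUES (?,?,?)", bridge_rows(org))
con.execute("CREATE TABLE fact_sales (org TEXT, amount REAL)")
con.executemany("INSERT INTO fact_sales VALUES (?,?)",
                [("Boston", 5.0), ("West", 3.0), ("East", 2.0)])

# Roll everything under 'East', at any depth, into one total:
# Boston (5.0) plus East's own sales (2.0).
total = con.execute("""
    SELECT SUM(f.amount) FROM fact_sales f
    JOIN org_bridge b ON b.descendant = f.org
    WHERE b.ancestor = 'East'""").fetchone()[0]
print(total)
```

It is brute force in exactly the sense Reeves meant: the bridge must be rebuilt when the hierarchy changes, but it keeps simpler access tools in the game.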
By Thursday night I noticed my energy flagging so once again I had to fool my poor body. I headed down to Lou’s Pier 47 with a few friends for a little JUNIOR DEVILLE playin’ the blues. Mission
accomplished.
Friday
Whoa doggie! Three aspirin, a double tall latte, and I am ready to take on meta data one more time.
David Gleason’s class, ‘One Thing at a Time – An Evolutionary Approach to Meta Data Management’ made a strong case for paying attention to meta data, but not with a ‘big bang’ approach. He
made sense. Data warehousing suffered from the ‘everything in one delivery’ strategy in its early days. Since then we have learned the importance of business-driven incremental development.
Gleason took this approach one step further and adjusted it for meta data management. He wrapped this concept in a raft of good suggestions and made even the last day of this conference pretty darn
valuable.
Home again.
That was a great conference, but I don’t think I can fool my bod’ one more time. Until next time…