Eye on TDWI – Baltimore – May 2005

The plane ride out from Seattle was as all such travel should be, smooth and uneventful. As we neared Baltimore, home of Ft. McHenry of Star Spangled Banner fame, I was mellowing out to Branford
Marsalis’s “Eternal”, and thinking about what this TDWI held in store. I had planned to catch up with the state of ETL technology and check out the progress in integrating unstructured data with
the data warehouse. Given the proximity of Baltimore to Washington D.C., TDWI had scheduled a couple of interesting looking sessions focused on data warehousing and business intelligence in the
government sector. Of course, there were the inevitable nuggets that make these trips so great.


After checking into my room at the Waterfront Marriott, I went down to visit the Informatica Hospitality Suite. The food and drink were top notch, and the program was quite engaging. Gary Reicher,
from T. Rowe Price, talked about the Integration Competency Center. This is a strategy that Informatica has been championing recently and T. Rowe Price provided a good example of how this concept
can work for a company.

Organizations using the Integration Competency Center (ICC) approach can leverage the experience they gain from data integration undertakings such as building data warehouses and apply those
experiences across multiple data integration projects such as ERP deployments. Data integration consumes the lion’s share of resources in a data warehouse program both in the cost of technology
and the development of a competent data integration team. By pushing those costs out to the ICC, the enterprise can amortize them over a range of data integration projects. Forrester Research
advocates this approach in order to more ably show how data integration technology can have strong total economic impact.

We finished off the evening with a champagne toast to a new book, Integration Competency Center by John Schmidt and David Lyle. Being the geek that I am, I thought it might make for a good bed-time
read. I bought a copy on the spot. It turned out to be well written and had some good observations on the progress of scientific thought.


TDWI shifted into high gear on Monday morning with Dan Merriman’s keynote titled, “Driving the Integration of CPM and ROI”. The subject of the business/IT divide, as it relates to BI and data
warehousing, has shown up in the press and in past TDWI keynotes, and I’m sure that most of us in the business can tell our own stories about dysfunctional organizations that reflect this gap.
However, Merriman gave us some good ideas on how to bridge the divide.

Merriman noted that we are collecting data at an increasing rate. This puts the data warehousing and BI professional in an ideal position to help their organizations realize the value from this
opportunity, and avoid having these data stores overwhelm them. He suggested that BI/DW project leads should link project ROI to Balanced Scorecard metrics. He pointed out that BI/DW professionals
are usually focused on ROI for their individual projects or maybe the DW at best. However, DW and BI professionals are in an excellent position to take a more global view and to help business
management identify key metrics for management scorecarding throughout the organization.

Merriman suggests that DW and BI professionals become involved with scorecarding because they are likely to have insight as to the information needed for effective scorecard programs. However, they
will likely have to take the initiative with senior management, since IT does not have a strong history of providing this sort of consultative support.

After the keynote, my fellow conference goers and I headed off for our different sessions. Michael Gonzales’s ever popular Hands-On ETL has been tough to get into at past conferences, but I had
signed up early this time, so I was able to get a seat. Gonzales’s Hands-On classes always include background on the class of technology at hand plus a good review of the vendor space. However,
Gonzales does caution that his sessions are not bake-offs between competing tools, but simply designed to give the student familiarity with representative technology.

Over the course of the day we learned about ETL terms, the range of ETL functionality, architectural strategies in the data transformation space, industry trends, and Gonzales’s forecasts. We also
got hands-on experience with Microsoft’s DTS, Syncsort’s DMExpress, Oracle’s Warehouse Builder, IBM’s WebSphere DataStage (formerly known as Ascential DataStage), and SAS’s ETL Studio. Our
exercises covered a wide range of functionality from joins where we assigned keys based on different sources to using different techniques to sort out duplicates. Although a day was not enough time
to really dive into each tool, I did get a feel for each one of the interfaces and the target markets for each tool.
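
As a rough illustration of the two kinds of exercise mentioned above, here is a minimal Python sketch of assigning keys to records from different sources and then sorting out duplicates. It is not taken from the class and does not reflect how any of the tools listed work internally. The sample records, the `assign_keys` and `dedupe` helper names, and the match rule (normalized name plus ZIP code) are all assumptions made for the example.

```python
from itertools import count


def normalize(record):
    """Build a match key from name and ZIP (a hypothetical match rule)."""
    return (" ".join(record["name"].lower().split()), record["zip"].strip())


def assign_keys(*sources):
    """Give every distinct customer one surrogate key across all sources."""
    next_key = count(1)
    key_by_match = {}
    keyed = []
    for source_name, records in sources:
        for record in records:
            match = normalize(record)
            if match not in key_by_match:
                key_by_match[match] = next(next_key)
            keyed.append({**record, "source": source_name,
                          "customer_key": key_by_match[match]})
    return keyed


def dedupe(keyed_records):
    """Keep the first record seen for each surrogate key."""
    seen = set()
    unique = []
    for record in keyed_records:
        if record["customer_key"] not in seen:
            seen.add(record["customer_key"])
            unique.append(record)
    return unique


# Made-up sample data standing in for two source systems.
billing = [{"name": "Ann Lee", "zip": "21202"},
           {"name": "Bob  Ray", "zip": "98101"}]
orders = [{"name": "ann lee", "zip": "21202 "},
          {"name": "Cy Poe", "zip": "20001"}]

keyed = assign_keys(("billing", billing), ("orders", orders))
customers = dedupe(keyed)
```

Keeping the first record per key is only one of several possible survivorship rules. A real project might prefer the most recently updated record or merge fields from each source.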

That evening I dropped in on the IBM hospitality suite. They were showcasing one of their partners, Siebel Systems, Inc. Siebel has built a reputation on their customer relationship management
technology. However, this night, they gave us an overview of Siebel Business Analytics, which consists of pre-built dashboards that run in real time on data integration platforms. Although they
include a wide range of standard reports, Siebel also includes a capability that allows the user to explore problem areas in their businesses with interactive analysis. Although Siebel would be
happy to sell this package as an alternative to the in-house developed data warehouse, I can see situations where organizations may want to incorporate analytical applications such as Siebel’s
along with their data warehouses and other BI tools. This will be especially true as these organizations realize the increasing value of BI and data warehousing.


TDWI showcases a number of analyst viewpoints and case studies in their Tuesday BI Strategies sessions. After Siebel’s presentation the night before, I decided to catch a session presented by Mike
Masciandaro, Business Intelligence Director for Rohm and Haas.

Masciandaro related how Rohm and Haas had implemented SAP across the board and migrated their home-grown data warehouse to SAP BW, which is the pre-packaged data warehouse that SAP offers. He told
us that, before this migration, the heterogeneity of their different transaction systems had resulted in serious on-going data quality issues. SAP BW provided a massive improvement in data quality
for them, and he seemed to be pretty happy with the results. Rohm and Haas also experienced significantly reduced support costs. He did say that the few systems that remained outside of the SAP
umbrella continued to offer integration challenges.

The TDWI vendor show opens up on Tuesday at each conference, and this time I wanted to check out a few products that offered something a little different.

Endeca was showing off its new product, Latitude. Latitude is an interactive reporting product that runs on top of Endeca’s search capabilities for structured and unstructured data. Later in the
week I was planning to catch a session on BI and unstructured data so this was a find.

FairIsaac had a booth, and curiosity got the best of me, so I stopped to chat. I had always associated FairIsaac with their credit scoring algorithms, but they have branched into what they call
Enterprise Decision Management or EDM. EDM is based on a business rules management system and predictive analytics that extend BI into the operational side of the house. This sort of technology
will need real-time data supported by historical data and it should help to make real-time data warehousing pay off handsomely.

Given that I am interested in seeing small to medium sized companies adopt some of the BI/DW initiatives that have proved successful with larger firms, I am always on the look-out for vendors that
target this market. However, the leaders in the BI tool suite space have all but ignored the small to medium sized business and there are few reporting tools out there that are affordable for the
smaller company. Viador is an exception. They have had self-service reporting and portal products on the market for several years and have just released a pure DHTML product called
ReportOne/Analyzer. The demo looked good, their price point was reasonable, and I concluded that these guys should be on any small to medium sized business’s short list of candidate reporting tools.

That afternoon I took a break and went over to the National Aquarium in Baltimore. The place had a wonderful assortment of fish from around the world including a serious shark tank. However, the
thing that impressed me the most was the number of different types of brightly colored poisonous frogs that were on display. I knew that there had to be a lesson in all of this, but I didn’t have
the inclination to figure it out. I chose instead to just enjoy the place and all that it had to offer.


TDWI usually sponsors sessions targeted at particular sectors of the economy such as education, health care, etc. TDWI’s purpose behind these sessions is to address the specific needs of an
industry and to give those working in that industry a chance to meet others with similar challenges. This conference included a track for government agencies.

Steve Williams kicked off the first session on BI and the government with his class titled “BI and the President’s Management Agenda: Driving Performance Improvement”.

It looked pretty interesting so I dropped in.

Williams pointed out that we have the first President in history to have an MBA, and many of the management objectives that agencies now need to meet include things like increased emphasis on
performance management, more timely and reliable financial reporting, accountability, and integrated cost and performance data. Williams then went on to introduce the Government-Centric BI/DW
Development Method. He gave a number of examples of agencies that have used this method.


R. Todd Stephens kicked off the second keynote of the conference. He titled it, “An Information Odyssey: The Future of Business Intelligence.” I have heard several analysts and consultants talk
about the future of BI in the past. They had focused on the challenges of really big data, real-time delivery, and outsourcing so I wasn’t expecting much new. However, Stephens was the Director of
Metadata Services for BellSouth. As somebody “on the ground”, he did bring a unique perspective to the subject.

Stephens started out by reviewing the work of other futurists. He quoted Peter Drucker’s “Ages of Civilization” which points out that the information age may be already drawing to a close and
that we are on the brink of the age of knowledge. He then extended Drucker’s ideas to the BI/data warehousing arena. Stephens suggested that the hub-and-spoke model and other current data
warehousing architectures are artifacts of the information age. In the age of knowledge, such frameworks would give way to service oriented architectures. I thought about Sunday night’s
hospitality suite hosted by Informatica and Gary Reicher’s presentation on integration competency centers. Stephens had a good point.

Stephens then went on to confirm that integration is at the heart of a service oriented architecture. He suggested that integration will be needed for data as well as governance and various
business and technical initiatives. He also saw BI as being embedded within the fabric of service oriented architectures. Like other futurists, Stephens saw technical jobs going to other economies.
At home, successful BI workers will be knowledge workers. They will work within a framework where they will be treated not as employees but as service providers, and companies will come to be known
as organizers not employers. Hmmm.

With all that, I went down the hall to catch Nancy Williams’s session on “BI issues in Government”. Some of the challenges to BI initiatives involve integrating data from different groups that
are reluctant to share. Government timeframes are often so long that expectations are low. BI initiatives also play second fiddle to other IT initiatives and agencies are slow to adopt the latest
technologies. Personal incentives are often missing. Measuring mission success is also difficult when profit is not in the picture. However, she pointed out that the emphasis on performance
management is making BI more important. She had done a survey of 12 government agencies and 97% of her respondents said that BI and data warehousing were important or critical to their organizations.

In spite of the title of the session, plus the complaints about inertia and a dearth of risk taking in government that my fellow classmates voiced, I did see a number of parallels
between the challenges of government BI initiatives and those taken on by large, bureaucratic corporations. Both are subject to cultural roadblocks and need to be periodically refocused on their

Later I dropped in on Mike Lampa’s night school session, “Data Warehouse Health Check.” Lampa briefly went over the characteristics of a healthy data warehousing initiative and of one with problems,
an assessment exercise and review, and suggestions for improving the health of the data warehouse. He pointed out that a healthy data warehouse is built upon a solid architecture, designed to
anticipate change, and goes through regular checkups for architectural alignment with the business. Questionable data quality and long delivery times, on the other hand, are symptoms of an ailing data warehouse.
Lampa also presented a 360 Degree Data Warehouse architecture that incorporated elements of a service oriented architecture. It was a good session, based on good fundamentals. Hopefully it will be
expanded at future TDWI conferences.


I decided to wrap up the week with David Grossman’s session on “Future Trends: Integrating Structured and Unstructured Data.” This is a subject that the “father of data warehousing,” Bill
Inmon, started to promote a couple of years ago. Others have also added their support to the thinking that it’s about time to draw these two types of data together, but Grossman took us to the next
step. He reviewed the challenge of providing a single correlated query against data stored in different formats including tabular, text, image, audio, and video. He also talked about the efficacy
of current search engines and explained how the links they retrieve often fall well short of covering all relevant documents while bringing back many that are not relevant. He then turned our
attention to the issue of bringing unstructured data, which makes up about 80% of the world’s data, into the data warehouse.

Grossman reviewed five design options for making unstructured data available to warehouse users.

The first option involves moving all available unstructured data to the data warehouse. This does have some drawbacks besides the obvious task of migrating such a large volume of data and keeping
it fresh. Users will need to have tools that support SQL extensions in order to retrieve unstructured along with structured data. There are some technologies that support this, but performance is
not great.

The second option is to store text indexes in the warehouse. This avoids the need to migrate large amounts of data, but it does require a “‘crawl’ of all of the text to build the index.”
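
To make the trade-off concrete, the following is a minimal sketch of the kind of text index this option implies: a single pass, or crawl, over the documents builds a map from each term to the documents that contain it, and only that map would need to live in the warehouse. This is an illustration under stated assumptions, not Grossman's design. The sample memos and the `crawl` and `search` function names are hypothetical, and a production index would also handle stemming, stop words, and ranking.

```python
import re
from collections import defaultdict


def crawl(documents):
    """Read every document once and map each term to the ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Made-up documents that stay in their original store; only the index moves.
documents = {
    "memo-1": "Customer complaint about late shipment",
    "memo-2": "Shipment delayed by weather",
    "memo-3": "Customer praised the support team",
}
text_index = crawl(documents)
matches = search(text_index, "customer shipment")
```

The index holds document ids rather than the text itself, so a warehouse query can find which documents mention a term and then follow the id back to the source system.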

The third option involves storing items of interest in the warehouse. These items would be identified, or “tagged” through text mining techniques. However, this technique requires the
identification of relationships and the technology is just developing in this area.

Grossman talked about portals as a fourth option, but pointed out that portals are often used to “dress up” stovepipe data stores that are not well integrated.

Enterprise Information Integration (EII) was Grossman’s fifth option. This option supports federated queries and depends upon good metadata. In spite of the fact that EII is somewhat reminiscent
of the virtual data warehouse which had several conceptual problems, he did position this last option as having potential for the future.
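
The dependence of federated queries on metadata can be sketched in a few lines. The example below is an assumption-laden illustration, not a description of any EII product: two in-memory SQLite databases stand in for separate source systems, and a hypothetical `METADATA` dictionary records which source and query answer each logical request. The table names, column names, and sample rows are all invented.

```python
import sqlite3

# Two independent, made-up sources standing in for separate systems.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE clients (client_no INTEGER, full_name TEXT)")
crm.executemany("INSERT INTO clients VALUES (?, ?)",
                [(1, "Ann Lee"), (2, "Bob Ray")])

billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (cust_id INTEGER, amount REAL)")
billing.executemany("INSERT INTO invoices VALUES (?, ?)",
                    [(1, 120.0), (1, 80.0), (2, 50.0)])

# Hypothetical metadata: which source answers each logical request, and how.
METADATA = {
    "customer_names": (crm, "SELECT client_no, full_name FROM clients"),
    "customer_totals": (
        billing,
        "SELECT cust_id, SUM(amount) FROM invoices GROUP BY cust_id",
    ),
}


def fetch(logical_name):
    """Run the source query registered for a logical name; return id -> value."""
    connection, sql = METADATA[logical_name]
    return dict(connection.execute(sql).fetchall())


def federated_customer_totals():
    """Join results from both sources on the shared customer id at query time."""
    names = fetch("customer_names")
    totals = fetch("customer_totals")
    return {names[cid]: totals.get(cid, 0.0) for cid in sorted(names)}


report = federated_customer_totals()
```

Nothing is copied into a warehouse here. The join happens at query time, which is why the approach stands or falls on the metadata that says `client_no` and `cust_id` refer to the same customer.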

After the conference wrapped up, I took my son and his wife to see Najee at Blues Alley in D.C. Najee had a truly great band including Alvin White on guitar, Nathan Wilkie on keyboards, Kentric
Morris on drums, and Rohan Reid on bass. The group had the crowd goin’ big time, and as if that wasn’t enough, Najee introduced an on-the-rise singer, Sisaundra, who brought the house down. What
a way to end up the week.

See you in San Diego for TDWI’s next conference coming up in August. San Diego is a great town and the line-up for the conference looks great.


About Tim Feetham

Tim is an independent consultant who specializes in data warehousing for small to medium sized businesses. He has worked in sectors ranging from travel, health care, finance and software, to higher education. He helped design the Data Resource Management Certificate at the University of Washington and has taught in that program for more than 10 years. Feetham is also a former senior research analyst for TDWI. He continues to contribute to TDWI publications and events.