New technologies bring new opportunities, not just to developers, but also to those of us in the data management profession. Over the past decade, many traditional data management organizations
shrank in size and budget as development groups grew larger. While it is true that attention to web pages, eCommerce, ERP/packaged applications, and Java led to a shift of resources towards
developers, I feel a broader reason is that data professionals do not track technology developments the way other technologists do. As a result, many of us lose valuable opportunities.
A good example of a lost opportunity is the development of XML. In the late 1990s, XML was heavily promoted for being application-independent and self-describing. It is more flexible than flat
files or EDI, as it can carry more information, such as data types or data validation rules, in a self-contained file. Increasing numbers of software groups use XML to package their APIs or save
their data in XML formats. Yet as of 2006, most data management professionals (based on interactions through DAMA meetings) are still not involved in, and cannot strongly influence, XML schema
design, review, and deployment. The problem was compounded because many data modeling vendors (save Sybase with PowerDesigner) did not provide XML design capabilities as a way to represent data.
Two new technologies are gaining momentum and offer new opportunities to reassert data management’s core strengths and value to the business: Service Oriented
Architecture (SOA)-driven integration technologies (also known as Enterprise Service Bus, or ESB) and Enterprise Information Integration (EII). Both rely on the availability of a
common information model that is application-independent, and on analysis of the data in the information sources to arrive at standards that represent the information content, not just the structure.
Data Management and Service Oriented Architecture through XML
SOA relies on web services as a way of interfacing with data sources. Different services can be called against different data sources to get the user what they want without creating separate
applications or additional redundant data sources. If your provisioning and billing systems are different, and you want to create a web application to see if customers qualify for your services
before starting the ordering process, calls such as CheckServiceAvailability, CheckCreditScore, OrderService, and CheckExistingCustomer can be distributed across multiple applications, letting each
application do what it was designed to do.
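As a rough sketch of this idea, the Python below composes the four calls named above into one qualification step. Everything in it is an assumption made for illustration: the function bodies are in-memory stand-ins for what would really be web service calls, and the customer IDs, postal codes, and the credit threshold of 600 are invented sample values.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for services owned by different applications.
# In a real SOA deployment each of these would be a web service call.
EXISTING_CUSTOMERS = {"C001"}
CREDIT_SCORES = {"C001": 720, "C002": 540, "C003": 680}


def check_existing_customer(customer_id: str) -> bool:
    # Assumed to be answered by the billing system.
    return customer_id in EXISTING_CUSTOMERS


def check_service_availability(postal_code: str) -> bool:
    # Assumed to be answered by the provisioning system.
    return postal_code.startswith("9")


def check_credit_score(customer_id: str) -> int:
    # Assumed to be answered by a credit-scoring service.
    return CREDIT_SCORES.get(customer_id, 0)


def order_service(customer_id: str, postal_code: str) -> str:
    # Assumed to be answered by the ordering application.
    return f"ORDER-{customer_id}-{postal_code}"


@dataclass
class QualificationResult:
    qualified: bool
    reason: str


def qualify_customer(customer_id: str, postal_code: str,
                     min_score: int = 600) -> QualificationResult:
    """Combine the individual services into one qualification decision."""
    if not check_service_availability(postal_code):
        return QualificationResult(False, "service unavailable")
    if check_existing_customer(customer_id):
        return QualificationResult(True, "existing customer")
    if check_credit_score(customer_id) < min_score:
        return QualificationResult(False, "credit score below threshold")
    return QualificationResult(True, "new customer approved")


def place_order(customer_id: str, postal_code: str) -> str:
    """Only start the ordering process for customers who qualify."""
    result = qualify_customer(customer_id, postal_code)
    if not result.qualified:
        raise ValueError(result.reason)
    return order_service(customer_id, postal_code)
```

The web application only knows about qualify_customer and place_order; which system answers each underlying question stays hidden behind the service names.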
That said, a web service is only as good as the effort its designer puts into it, although XML makes these interfaces more flexible and forgiving. Many development organizations that promoted
the benefits of SOA to their management are struggling with how to create canonical models to deliver on the promise of the technology. Even in collaborative environments where data
architecture and development resources work together, the challenge continues to be the number and skill set of data resources that can support such efforts and the accompanying ongoing
responsibilities. In these organizations, as well as in many consulting companies, there is increased demand for XML- and integration-savvy data architecture resources to design or at least assess
canonical models, so that what is deployed is in fact application-agnostic and flexible.
So why is good data architecture important to web services, and what are our challenges? To use the architecture analogy, to have reliable applications with robust performance and low TCO (total
cost of ownership), you need robust composite and canonical services. Emphasizing low TCO is important, since once you define canonical services, the enterprise should ideally build upon and expand
them rather than create additional similar-sounding services for different applications. Such duplication can very easily lower the promised value of SOA, and pointing this out can help data resources
gain management support for choosing the wiser course.
Before you can define a GetAccount service, you need to know what an Account is, how it relates to a Customer or Vendor (or shall we say Party Role), and its key attributes. Also, XML is a
hierarchical modeling approach, as opposed to a bi-directional graph like an ER model, so it is perfectly appropriate to have two different XML models of the same data. A CustomerOrders XML specification can list
Customer then Order information in a reporting/data integration type scenario. An Orders specification can list Order then Customer information and be a more effective archiving approach. XML’s
nature provides this flexibility while requiring careful thinking to balance redundancy vs. security and performance.
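To make the two hierarchies concrete, here is a minimal sketch that builds both documents from the same small data set using Python's standard xml.etree.ElementTree module. The element and attribute names, and the sample customers and orders, are my own assumptions for illustration, not a proposed standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, as it might sit in relational tables.
customers = {"C001": "Acme Corp", "C002": "Globex"}
orders = [
    {"id": "O100", "customer_id": "C001", "total": "250.00"},
    {"id": "O101", "customer_id": "C001", "total": "75.50"},
    {"id": "O102", "customer_id": "C002", "total": "19.99"},
]


def build_customer_orders() -> ET.Element:
    """Customer-then-Order hierarchy, suited to reporting/integration."""
    root = ET.Element("CustomerOrders")
    for customer_id, name in customers.items():
        customer = ET.SubElement(root, "Customer", id=customer_id, name=name)
        for order in orders:
            if order["customer_id"] == customer_id:
                ET.SubElement(customer, "Order",
                              id=order["id"], total=order["total"])
    return root


def build_orders() -> ET.Element:
    """Order-then-Customer hierarchy: each order is self-contained,
    at the cost of repeating customer details on every order."""
    root = ET.Element("Orders")
    for order in orders:
        order_el = ET.SubElement(root, "Order",
                                 id=order["id"], total=order["total"])
        customer_id = order["customer_id"]
        ET.SubElement(order_el, "Customer",
                      id=customer_id, name=customers[customer_id])
    return root


if __name__ == "__main__":
    print(ET.tostring(build_customer_orders(), encoding="unicode"))
    print(ET.tostring(build_orders(), encoding="unicode"))
```

In the Orders document the name "Acme Corp" appears once per order, which is the redundancy trade-off mentioned above: each archived order stands on its own, but customer data is repeated.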
What does this mean for data professionals? If you are already combining XML with data modeling to support your company’s SOA initiatives, I invite you to share what you have learned with the
broader community. If you are not involved in XML design, however, pick up a book, talk to the developers to understand their challenges, and start applying your modeling, domain and reference
value definition, and data analysis skills towards helping address your company’s XML problems.
Data Management and Enterprise Information Integration
If XML is not an area you want to get into, there is another technology gaining ground that will need the assistance of data architects. EII (Enterprise Information
Integration) promises to cut down on the development time in delivering operational reporting solutions when data is sourced from multiple applications, regardless of the technology of the data
sources (different relational databases, flat files, web services, …). This is accomplished by implementing an application-independent data model in the EII tool, and specifying how data in
different source systems would map to it.
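The mapping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, column names, and sample rows are invented, and the sketch copies rows into memory, whereas an actual EII engine would translate and optimize queries against the live sources rather than work this way.

```python
# Attributes of the application-independent (common) Customer model.
COMMON_CUSTOMER_FIELDS = ("customer_id", "name", "postal_code")

# Hypothetical source data, each source with its own naming conventions.
SOURCES = {
    "billing": [
        {"CUST_NO": "C001", "CUST_NM": "Acme Corp", "ZIP": "94105"},
    ],
    "provisioning": [
        {"accountId": "C002", "accountName": "Globex", "postalCode": "10001"},
    ],
}

# How each common attribute maps to a source-specific column.
MAPPINGS = {
    "billing": {
        "customer_id": "CUST_NO",
        "name": "CUST_NM",
        "postal_code": "ZIP",
    },
    "provisioning": {
        "customer_id": "accountId",
        "name": "accountName",
        "postal_code": "postalCode",
    },
}


def to_common(source: str, row: dict) -> dict:
    """Translate one source row into the common Customer model."""
    mapping = MAPPINGS[source]
    record = {field: row[mapping[field]] for field in COMMON_CUSTOMER_FIELDS}
    record["source"] = source  # keep lineage for troubleshooting
    return record


def unified_customers() -> list:
    """Present rows from every source through the common model."""
    return [
        to_common(source, row)
        for source, rows in SOURCES.items()
        for row in rows
    ]
```

The reporting user sees only customer_id, name, and postal_code; adding a new source means adding a mapping, not changing the common model.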
EII engines’ ability to optimize a query across applications without a physical data store promises to both lower development time and lower costs compared to traditional data marts, which were
used for integrated reporting as opposed to true analytics. While the technology is still in early stages of deployment, it has gained enough attention that even larger ERP and reporting vendors
are working on solutions. The challenge continues to be who can provide these models in a timely manner to deliver on the value of the technology.
Logical data models are excellent starting points for your company’s EII development. Furthermore, by providing common design patterns for cross-referencing and hierarchical data management, you
can provide a common view to the end users, regardless of the data sources the information may come from. While there is a performance-tuning aspect of the model as well, this is mostly handled by
the engine. Additional considerations will involve specifying data access security roles and rules.
Summary, Lessons, and Recommendations
XML, ESB, and EII rely on the ability to develop an implementation model that is application-agnostic, which is often based on a logical data model. To verify the model and to increase the quality
and speed of system integration, data standards and profiling capabilities will be of significant benefit. These are core data management strengths and we can make a difference.
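As a small illustration of what profiling contributes, the sketch below summarizes one column of source data. The function name, the summary fields, and the sample rows are assumptions of mine; commercial profiling tools report far more than this.

```python
from collections import Counter


def profile_column(rows: list, column: str) -> dict:
    """Summarize one column: row count, missing values, distinct values."""
    values = [row.get(column) for row in rows]
    present = [value for value in values if value not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical sample rows from a source system.
sample_rows = [
    {"state": "CA"},
    {"state": "ca"},
    {"state": "CA"},
    {"state": ""},
    {"state": None},
]

state_profile = profile_column(sample_rows, "state")
```

Here the profile reports two distinct values ("CA" and "ca") for what should be a single state code, plus two missing values, which is exactly the kind of finding that argues for a data standard before integration begins.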
Information and knowledge management is regaining its importance, and it will be up to us to meet the need and the challenge. We need to keep up with new technologies, anticipate challenges we can
help meet, identify existing capabilities (data models, profiling) that can support the new efforts that have management’s attention, and step up to the plate to offer our support. When we
show our benefit and relevance, we can make the case for additional training and headcount and start reestablishing our presence in supporting the business.