Approaches to Enterprise Data Architecture
Recently I (DCR) attended Enterprise Data World, looking for approaches to establish data governance in an organization with decentralized control. I attended as many sessions on data governance as I could find. The approaches used fell into two main categories:
- Top-down: get the support of top management; then, with their backing, set up governance and simply tell everyone that they must participate.
- Peer-based: a variety of approaches other than top-down were discussed, such as creating a community of practice or providing services to projects.
Several sessions reported on success using the top-down approach. However, these seemed to be primarily from organizations that were limited in scope, with very centralized management around that limited scope. At these sessions, it was not unusual to hear questions from the audience about what to do if you can’t get top management support, from attendees who described unsuccessful efforts over years to obtain such support.
I also attended a small number of sessions reporting success with what I call peer-based approaches, which add value to other business processes through data architecture and use that added value to obtain enterprise-wide support. The slogan “I’m from data architecture and I’m here to help you” was even heard at one presentation. The ideas presented here adapt and extend presentations from that conference. Unfortunately, my notes don’t permit detailed attribution, although I do want to acknowledge the contributions of Jaime Fitzgerald [1].
The observations from the conference prompted a reexamination of the approach to data governance in our own decentralized organization. Together, a colleague of mine (Kay) and I decided that instead of expressing our goals for enterprise data architecture in lofty enterprise terms we could express them in simpler terms that would appeal to managers of software development projects, and that we could present our work as a service to development projects. This article reports on that new formulation, which has been well received.
Data Architecture as a Service
Synthesizing the approaches described for peer-based data architecture, we call our approach DAaaS, so that we can inherit some buzz from SaaS and PaaS. The whole approach has been devised so that it looks like a helpful service to our clients, while we perform data architecture tasks that are usually accomplished through governance.
We tell application projects that we’ll help them with all the complexity they encounter dealing with data on their project. We help with data models, we lend them our data modeling tool, and so on. We want to make our offering genuinely attractive and helpful, so that even a project manager who has dealt with us once will come back a second time.
We help business data owners by reducing the complexity surrounding control of access to their data. We coordinate the formulation of enterprise-level rules for access to information, a set of rules that meets all the policy constraints about how data must be controlled and protected, and we work with the implementers to see that those rules are put into place in all applications that can access their data.
For managers, we use a very simple statement of why DAaaS is needed, based on sharing of data. If data is to be shared between applications, then the applications that share access must agree about the structure and meaning of the shared data. If they don’t agree, then our IT infrastructure will be filled with transformations, which take time and money to build, introduce delays in getting correct data, increase time to market for changes and new applications, and can introduce fragility through unforeseen side effects of changes. So the deal we offer is a simple one: we make application projects more effective through our services, and the results we produce make the entire IT establishment more effective.
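A back-of-the-envelope sketch can make the cost of disagreement concrete. It assumes the worst case, in which every application exchanges data with every other application and each direction of each exchange needs its own transformation. The function name and the application counts are illustrative assumptions, not figures from our organization.

```python
def pairwise_transformations(n_apps: int) -> int:
    """Upper bound on transformations when no two applications agree.

    Assumes every application shares data with every other one and that
    each direction of each exchange needs its own transformation.
    """
    if n_apps < 0:
        raise ValueError("n_apps must be non-negative")
    return n_apps * (n_apps - 1)


# Illustrative application counts only.
for n in (2, 5, 10, 20):
    print(f"{n:>2} applications -> up to {pairwise_transformations(n)} transformations")
```

Under these assumptions the count grows roughly with the square of the number of applications (ten applications could need up to 90 transformations), which is the growth that agreement on structure and meaning is meant to avoid.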
It’s essential to our approach that we offer help, so our internal Web site publicizes all the artifacts that we’ve developed. We have developed the requisite high-level enterprise data model, following the convention that such a model should have fewer than 20 entity types, which helps it show the essential relationships that must exist across the enterprise. We also have a number of more detailed data models. All of them are available in our repository, so that applications can use them directly in their own models.
The conventional wisdom is to have that high-level enterprise data model approved by senior executives, simplifying it as needed in order to obtain that approval. We decided that senior executive approval of a ten-entity-type data model is not very meaningful (does anyone think that senior managers are actually saying that this data model is the one that they want, rather than some other data model?), so we regard our enterprise data model as an evolving work that’s useful in relating data across the enterprise. The approval that it has comes from the wisdom of the crowd, from its use in coordinating data sharing.
Similarly, we have a number of checklists and guidelines, which we also regard as evolving works. Rather than submitting them for formal approval or review, we post them on our Web site, where they evolve based on experience, again driven by the wisdom of the crowd.
Closely paralleling the enterprise data model is the enterprise data concepts model, a structured glossary that provides definitions for the high-level entity types in the enterprise data model and some additional concepts that are important to the enterprise. One could argue that the enterprise data concepts model should come first, but if you’re a data modeler you might sketch an enterprise data model in order to identify the basic concepts. Whichever originally came first, they are now used and developed in tandem, and in fact are related, because structure and meaning are not completely independent of each other.
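As a minimal sketch of how a structured glossary and a data model might be kept in tandem, the following represents each concept as a small record and reports entity types that still lack a definition. The class, the function, and all sample entries (Party, Agreement, Location, Fiscal Year) are hypothetical illustrations, not our actual models.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Concept:
    """One entry in the structured glossary."""
    name: str
    definition: str
    # Matching high-level entity type in the data model, if there is one.
    entity_type: Optional[str] = None


def undefined_entity_types(entity_types: Iterable[str],
                           glossary: Dict[str, Concept]) -> List[str]:
    """Return entity types in the data model that no glossary entry defines."""
    defined = {c.entity_type for c in glossary.values() if c.entity_type}
    return sorted(set(entity_types) - defined)


# Hypothetical content, for illustration only.
glossary = {
    "Party": Concept("Party", "A person or organization of interest to the enterprise.", "Party"),
    "Agreement": Concept("Agreement", "A formal arrangement between parties.", "Agreement"),
    # An additional concept with no corresponding entity type.
    "Fiscal Year": Concept("Fiscal Year", "The twelve-month accounting period."),
}
model_entity_types = ["Party", "Agreement", "Location"]

print(undefined_entity_types(model_entity_types, glossary))
```

Here the hypothetical Location entity type is reported because the glossary does not yet define it, while Fiscal Year shows a concept that exists only in the glossary.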
We have identified that we have two types of customers: business data owners and software projects.
Business data owners have responsibility for data integrity and security. We help them deal with requests for data access that come from application projects. Such requests can be quite abstract, since they can come from a project that’s developing software that is never directly accessed by any end user. We also perform configuration management for access control objects, a further service to business data owners.
For application projects we can help with data models and the meaning of data. If the project is developing software, we can lend seats for our data modeling software, which also provides access to our repository of data models and our glossary. By helping projects with data models, showing them models that have already been used for some entity types their project deals with, showing them glossary definitions for these entity types, we converge data models and definitions early in the project’s life without the project even realizing that they have been “governed.” They instead think that they have been helped. Although the function of data architecture is often focused on data models, our experience shows that a more important contribution is made by producing agreement on the meanings of entity type names and attribute names.
It’s important for the enterprise data architects to have spent quality time thinking about enterprise data concepts and their definitions, so that project engagements can be productive, and add value right away.
Services for application projects are centered on a need to define and appropriately structure shared enterprise data. Rather than dive straight into how to implement those 50 attributes that comprise a project’s data requirements, we start with the enterprise data concepts model to help identify which concepts are central to the project’s data requirements. This approach doesn’t take much time, but right away introduces higher-order thinking; it helps us lift our heads above the project’s parochial needs and enables the project to see how their data fits into the bigger picture of enterprise data. There will likely be more than one enterprise concept at play within a project. At this point, the enterprise data model helps by showing how the concepts should relate to one another.
Having an understanding of the enterprise concepts within a project enables us to determine which reusable logical models are appropriate to the project. We evaluate the project’s major entity types and suggest naming alternatives and definitions. The goal is to align the project’s relational data to the enterprise framework in order to achieve consistency. It is important at this point to realize that the job of the enterprise data architect is to stay at the high level – to focus only on the major entity types and only on shared enterprise data. Project architects love to dive into vast amounts of implementation detail, and wouldn’t mind engaging in debate over arcane puzzles. The enterprise data architect must stay above that fray.
If necessary, the engagement can extend to customizing the reusable logical models with project-specific detail. The important point here is to always remember that the project customizations should be informed by the reusable models – meaning, the major entity types and their major relationships should always be present after the customization.
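The rule that major entity types and their major relationships survive customization can be sketched as a simple set comparison. This is an illustration under assumptions: models are reduced to sets of entity type names and (entity type, related entity type) pairs, and every name below is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Relationship = Tuple[str, str]  # (entity type, related entity type)


def missing_after_customization(
    reusable_entities: Set[str],
    reusable_relationships: Set[Relationship],
    project_entities: Set[str],
    project_relationships: Set[Relationship],
) -> Dict[str, List]:
    """List major entity types and relationships the customized model dropped."""
    return {
        "entities": sorted(reusable_entities - project_entities),
        "relationships": sorted(reusable_relationships - project_relationships),
    }


# Hypothetical reusable model and project customization.
reusable_entities = {"Party", "Agreement"}
reusable_relationships = {("Party", "Agreement")}
project_entities = {"Party", "Agreement", "Invoice"}  # project-specific addition
project_relationships = {("Agreement", "Invoice")}    # original relationship was dropped

print(missing_after_customization(
    reusable_entities, reusable_relationships,
    project_entities, project_relationships,
))
```

In this example the project kept both major entity types and added one of its own, but the check flags the dropped Party-to-Agreement relationship.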
Later in the project’s life cycle, the project will identify data services. The data services should use standardized XML to deliver data to the requesting applications. The terms and structure of the XML should reflect the same concepts and definitions as the logical models that came before.
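One way to keep the XML aligned with the logical model is to take element names directly from the model's entity type and attribute names, and to reject anything the model does not define. The sketch below uses Python's standard xml.etree.ElementTree module. The LOGICAL_MODEL dictionary, the to_xml function, and the Party example are assumptions made for illustration, not our actual data services.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Mapping, Set

# Hypothetical fragment of a logical model: entity type -> attribute names.
LOGICAL_MODEL: Dict[str, Set[str]] = {
    "Party": {"PartyName", "PartyType"},
}


def to_xml(entity_type: str, values: Mapping[str, str]) -> str:
    """Serialize one record, using only names defined in the logical model."""
    allowed = LOGICAL_MODEL.get(entity_type)
    if allowed is None:
        raise ValueError(f"unknown entity type: {entity_type}")
    unknown = set(values) - allowed
    if unknown:
        raise ValueError(f"attributes not in the logical model: {sorted(unknown)}")
    root = ET.Element(entity_type)
    for name in sorted(values):  # sorted for a stable element order
        ET.SubElement(root, name).text = values[name]
    return ET.tostring(root, encoding="unicode")


print(to_xml("Party", {"PartyName": "Acme", "PartyType": "Organization"}))
```

Because the element names come from the model rather than from each service's code, a term that is renamed in the logical model shows up as an error in the service instead of silently diverging.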
We’ve outlined an approach to data architecture that is easy for all the parties involved to accept. Executives can see that tangible benefits are produced because overall IT operation is improved by the avoidance of costly transformations. Project managers see that their project benefits because their work is reduced by the help they receive. However, it’s fair to ask how the results of this approach compare with what can be accomplished using a top-down approach.
At first blush, it might appear that DAaaS provides no governance at all, just happy help for everyone involved. That’s not the case; much of what happens during top-down data governance happens with DAaaS. The engagement process has been devised to achieve many of the results of top-down governance without actually subjecting projects to formal top-down governance.
But consider the alternatives. DAaaS can be used when the structure of the enterprise just doesn’t permit the top-down approach. In such a case, the alternative to DAaaS is not top-down; the alternative is no enterprise governance of data. The results obtained by DAaaS are clearly better than nothing.
There is one aspect of data architecture that is missing from the DAaaS approach: the enterprise strategy for data. The people carrying out DAaaS need to work toward enterprise goals, and broad agreement on those enterprise goals is important. In many decentralized organizations, it is possible to create an enterprise data strategy, even if it can’t be enforced. In that situation, DAaaS is well-adapted.
[1] Jaime Fitzgerald included some of these concepts in his presentation at Enterprise Data World 2010 in March 2010. See http://fitzgerald-analytics.com.