Architecture Is Objective, Design Is Subjective – August 2009

In my previous article, I mentioned the four questions that data architecture is trying to address:

  • What business activities are being carried out?
  • What information is being processed?
  • Which business activities require what information?
  • Where is the information located?

I also mentioned four standard views in the architectural framework that integrate into a single holistic view of the enterprise, which were:

  • The information or data view answering the “What information is being processed?” question.
  • The functional business or domain view answering the “What business activities are being carried out?” question.
  • The integration or data-flow view answering the “Which business activities require what business information?” question.
  • The deployment or technology view answering the “Where is the information located?” question.

This time around, we are going to discuss domain decomposition – the systematic analysis of a domain in order to decompose it into simpler, self-contained and discrete sub-domains that
can, as far as possible, be described and analysed independently of each other.

This essentially produces the domain view and the data-flow view. It allows us to answer the what, who, when and (if it’s relevant) the why aspects of the business and to
identify “owners” and “influencers” of each significant business activity.

As part of that, we’ll go through the purpose a domain decomposition serves, the benefits it provides, the problems and the deliverables that should be produced. Hopefully, we’ll also
establish a core set of objective principles that the resulting domain decomposition should conform to.


Why We Need a Formal Domain Decomposition
A domain is generally defined as any discrete subject area representing a field of action, thought or influence – an industrious or systematic activity undertaken to achieve a pre-defined
objective.

Even though it may not be apparent from the outside, all domains have an internal structure formed from a set of smaller interlocking domains (sub-domains) performing well-defined functions within
the overall domain.

Given that we are discussing enterprise data architecture, the domain that we are focussing on is the enterprise (i.e., an organisation created to carry out a defined set of business objectives).
Even then, an enterprise domain can range from a single-task focussed organisation through to the most complex “supra-enterprise” domains covering business activities across multiple
business sectors such as retail banking, investment banking and asset management. The more complex the domain, the more likely it is that it will need to be decomposed into manageable
sub-domains.

Within the enterprise domain, there will be many functional business areas (sub-domains) performing well-defined activities that contribute to the enterprise achieving its objectives. This can be
frequently seen from the organisation chart that an organisation may create to describe itself such as the following:

 

Figure 1

However, this is nearly always a “social structure” representation describing “reporting lines” and areas of responsibility. Unfortunately, at the operational level, it is
much more complex and would probably look something like this:

Figure 2

In addition, each of the main functional business areas will also have numerous internal activities, invisible to other parts of the enterprise, which it needs to perform and which will also involve
information exchange between these sub-domain functional areas.

Figure 3

Plus, of course, each of these may in turn be further decomposed into other areas that are even more specialised. It gets incredibly recursive if we don’t have rules for when to stop!

None of this will come as a revelation to anyone well versed in any of the structured analysis methodologies such as SSADM (Structured Systems Analysis & Design Methodology) or Jackson Functional
Decomposition (if you’re an old-school software engineer).

In many smaller organisations, much of this cross-functional activity is unstructured and much of the information flow is informal. In that situation, producing a formal domain decomposition may be
desirable but is usually unnecessary.

However, for large global organisations that have dedicated and, in many cases, replicated business activities operating across multiple locations, it’s not unheard of for the organisation to
implement multiple systems to support business activities at the local level (e.g., regional accounting) and then consolidate the generated business information in another location (the data
warehouse usually) for reporting, analysis and decision making.

In this situation having a decomposition of the enterprise domain to explicitly describe these data flows in a single integrated model and having a formal policy on how the sub-domains will interact
with each other should be a critical part of the basic distributed data architecture.

As well as a rationale for the domain decomposition activity, we should also attach some concrete objectives to producing a well structured domain. Some benefits are:

  • Dividing a complex problem into a simpler set of components makes the problem easier to resolve which, in turn, makes it easier to define the activities of each identified area of the business.
    This will save both time and money overall.

  • The domain decomposition covers the what, who, when and (if it’s relevant) the why aspects of the business and allows us to identify “owners” and “influencers” of
    each significant business activity that we need to consult whenever we want to change any of these aspects. This minimises the amount of time spent on impact analysis when change occurs.

  • Minimising the interdependency between the data components means that each business area needs only minimal knowledge of the internal structure of another business area in order to interact with it,
    which reduces the cascading impact of change. This “de-coupling” principle is the core tenet of any messaging, client-server or service-oriented architecture.

  • A prescriptive (i.e., principles driven) decomposition is easier to justify to business stakeholders when proposals are put forward to restructure a business activity to improve overall
    efficiency.

  • If we identify and govern the domain boundaries, then future data requirements can be linked into the distributed environment without significant refactoring. This minimises the impact of
    change over time.

  • Publishing a formal decomposition discourages the tendency for different business areas to create their own bespoke variations of existing corporate data assets. This reduces long-term support
    costs.

  • In addition, an organisation could have any number of specific management objectives that they want to achieve such as maintaining separate operating divisions or outsourcing a particular
    business activity to a third-party provider. For example, many investment banks outsource their trade settlement activities to external organisations and thus are interested in defining what the
    interface information looks like and the information required to carry out trade settlement but not interested in how the data is managed within the settlement component.


Architectural Principles for a Domain Decomposition
Given the above benefits and objectives, the resulting deliverables from the domain decomposition would be:

  • Identifying a set of functional business areas for grouping information together into logically self-contained components that can, as far as possible, be deployed and managed in isolation from
    the other data components.

  • Identifying the high-level information flows that need to exist between functional business areas (i.e., the business activities) that will be supported.
  • Establishing “owners” of each information flow by assigning it to one of the functional business areas.
  • Establishing the “influencers” of a business activity. These are the functional business areas that will be affected if the business activity changes and so would need to be
    “consulted” when any changes are proposed to ensure that they can comply with the change.

  • Identification and establishment of coherence boundaries – the points at which different functional business areas have to communicate with the outside world in a consistent and
    grammatically structured language.

Although listed as separate artefacts, they would be produced using a single model that is developed iteratively by identifying potential functional business areas and business activities and
possibly merging or splitting the candidates as a result of analysing the interdependencies.

The architectural principles need to cover the decisions that we make as a result of the characteristics of these interdependencies.


Identifying Functional Business Areas
Traditionally the identification of functional business areas and information flows or business activities would be an output from the business analysis carried out separately, as a precursor,
to the production of the enterprise data architecture. However, analysis of business activities tends to take place at a highly granular level with specifications being produced for individual
processes, and the enterprise data architecture might well be the first place that the integrated holistic view is considered.

In some cases, the functional business areas are identifiable vertical business areas such as finance, sales & marketing, human resources or product manufacturing; and in other cases, they are
cross-functional “horizontal” areas such as customer service or business intelligence.

Both perspectives may be present in an organisation, and an initial candidate list can be discovered by examining the company structure if the business is already operating (a
“brownfield” enterprise); or if it is a start-up company (a “greenfield” enterprise), then there will be similar organisations in the same business sector that we can use as a
template.

This will give us the initial candidate list that we can rationalise into a definitive list by assigning ownership of business activities to potential functional business areas. The functional
business areas on the final list will all have the following characteristics:

  • A functional business area must provide at least one business activity on behalf of the enterprise and must own something that it has sole responsibility for.

    If this is not the case, then the candidate functional business area is really just a role that some other functional business area is performing with respect to that business activity in which
    case the candidate functional business area can be removed.

  • Each sub-domain should represent at least one major business concept (as defined in the previous article on the business information model), e.g., sales, purchasing, customer relationship
    management (marketing), inventory control, products.

    Ideally, there should be one major business concept per sub-domain but tightly bound business concepts will likely be grouped together.

If an initial list of functional business areas is not available from the business analysis, then we can create the initial list by using the second characteristic above to form a potential list
of functional business areas from the primary business entities from the business information model and initially partition each of these as a separate sub-domain.
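
The rationalisation of the candidate list can be sketched in a few lines of Python. This is only an illustration of the two characteristics above, not a prescribed method: the CandidateArea class, the rationalise function and the sample areas and activities are all hypothetical names, and it assumes that each business activity has already been assigned to exactly one candidate.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateArea:
    """A candidate functional business area (hypothetical structure)."""
    name: str
    owned_activities: set = field(default_factory=set)
    business_concepts: set = field(default_factory=set)


def rationalise(candidates):
    """Split candidates into the areas we keep and the names we reject.

    A candidate is kept only if it owns at least one business activity
    and represents at least one major business concept.
    """
    kept, rejected = [], []
    for area in candidates:
        if area.owned_activities and area.business_concepts:
            kept.append(area)
        else:
            rejected.append(area.name)
    return kept, rejected


# Illustrative data only; these areas and activities are assumptions.
candidates = [
    CandidateArea("Sales", {"take order"}, {"customer order"}),
    CandidateArea("Product Management", {"fulfil order"}, {"product"}),
    # Owns nothing itself, so it is really a role played by another area.
    CandidateArea("Order Approval"),
]
kept, rejected = rationalise(candidates)
```

A rejected candidate is not lost; it simply becomes a role that one of the surviving functional business areas performs.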


Identifying Information Flows between Functional Business Areas
Identifying the business activities is again an output from the business analysis activity and would normally be a pre-defined input to producing the data architecture. Frequently, though, when we
ask the business stakeholders for a list of business activity information flows, we get a list of use cases such as the following:

Figure 4

In many cases, no more information is available because the process specifications haven’t yet been produced. This unfortunately isn’t really that useful from an architectural perspective
because it doesn’t tell us (1) what information is being used or produced by each activity; (2) how the separate activities link together; and (3) who owns each defined activity because the actors
are roles that use a service not those that provide it.

Instead we need a good old-fashioned data-flow model that shows the data that is being consumed or produced by each business activity, so instead of the above we have something like the
following:



Figure 5

I’m not going to discuss the detailed process for producing this data-flow model because it’s not really the subject of this article; but, as with the business information model, the
data-flow model doesn’t have to be fully defined in order to be useful.

We only need to know what business entities are being manipulated by each activity but don’t necessarily need to know exactly what attributes of those entities are required as input to a
particular activity or need to define the exact internal structure of each business entity. This can all be defined in detail as part of the process specifications, which may be produced beforehand
or afterwards.

What the data-flow model does give us are all the high-level details necessary to establish the ownership of each information flow.
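
As a rough sketch of how little detail is needed, the data-flow model can be held as a list of activities, each naming only the business entities it consumes and produces. The Activity class, the activity_links function and the "take order" activity are assumptions made for illustration; entity attributes and internal structure are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """A business activity described only by the entities it touches."""
    name: str
    consumes: frozenset
    produces: frozenset


def activity_links(activities):
    """Return (producer, consumer, entity) triples wherever the output
    of one activity is an input to another."""
    links = []
    for producer in activities:
        for consumer in activities:
            if producer is consumer:
                continue
            for entity in sorted(producer.produces & consumer.consumes):
                links.append((producer.name, consumer.name, entity))
    return links


# Illustrative data only; the activities and entities are assumptions.
activities = [
    Activity("take order", frozenset({"customer", "product"}), frozenset({"customer order"})),
    Activity("fulfil order", frozenset({"customer order"}), frozenset({"delivery"})),
]
links = activity_links(activities)
```

Even at this level of detail, the model tells us what information each activity uses and how the separate activities link together, which is what the use case list could not.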


Establishing “Ownership” of an Information Flow
“Ownership” of any asset is an important responsibility, and this area is fraught with corporate politics for the obvious reason that the more responsibility someone has, the
greater their political influence over the activities of other business areas.

Data assets are no different than any other assets in this respect; consequently, it is almost inevitable that multiple business areas will try to claim “ownership” of any business
processes and associated data assets that sit on or close to their sub-domain boundary so as to (1) be able to control the rate of change of that data asset, (2) be able to control the business
processes related to that asset and (3) be able to strongly influence any other business area that is dependent on that asset.

This, in a business context, is what ownership should mean:

  • The owner is responsible for making the data available to other business areas as and when those business areas require it.
  • The owner has responsibility for the definition and governance of the business entity under its stewardship and is responsible for enforcing a conformance to that definition and the data
    quality of any instances of the business entity.

  • The owner is responsible for providing an authoritative source for whatever data is managed within that sub-domain. An authoritative source is a definitive repository that anyone who wants to
    verify the veracity of a piece of data would have access to.

Because of the significance of ownership and to avoid the vested interests dictating the decision making, we need to define some rules for establishing ownership.

  • The owner of a business entity also owns all aspects of that business entity including any “component parts” (i.e., secondary classes that are directly composed into a primary
    business entity should be placed in the same domain as the composing entity).

  • The owner of a primary business entity should also be the owner of all sub-types of that primary business entity.
  • All relationships across domain boundaries should be unidirectional. This is important in order to establish which class is being referenced and which is doing the referencing. If an
    association is not unidirectional, then both classes involved in the association should reside in the same functional business area.
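
These three rules are mechanical enough to check automatically. The following is a minimal sketch, assuming the model is available as simple dictionaries and sets; the ownership_violations function, its parameters and the sample entities are hypothetical and only illustrate the rules.

```python
def ownership_violations(owner, components, subtypes, associations):
    """Check the three ownership rules and return a list of problems.

    owner:        entity -> functional business area that owns it
    components:   component part -> composing (primary) entity
    subtypes:     sub-type -> primary business entity
    associations: set of (referencing entity, referenced entity) pairs
    """
    problems = []
    for part, whole in sorted(components.items()):
        if owner[part] != owner[whole]:
            problems.append(f"component '{part}' must share the owner of '{whole}'")
    for subtype, supertype in sorted(subtypes.items()):
        if owner[subtype] != owner[supertype]:
            problems.append(f"sub-type '{subtype}' must share the owner of '{supertype}'")
    for source, target in sorted(associations):
        crosses_boundary = owner[source] != owner[target]
        # Report each two-way pair once by only reporting the ordering source < target.
        if crosses_boundary and source < target and (target, source) in associations:
            problems.append(
                f"association '{source}' <-> '{target}' crosses a boundary in both directions"
            )
    return problems


# Illustrative data only; the owners and associations are assumptions.
owner = {"customer order": "Sales", "order line": "Sales", "product": "Product Management"}
components = {"order line": "customer order"}
subtypes = {}
two_way = {("order line", "product"), ("product", "order line")}
one_way = {("order line", "product")}

problems = ownership_violations(owner, components, subtypes, two_way)
fixed = ownership_violations(owner, components, subtypes, one_way)
```

With the two-way association the check reports one problem; once product no longer refers back to order line, the list is empty. The alternative fix, as the rule says, is to move both classes into the same functional business area.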

Essentially, by applying these rules we are trying to convert the data-flow model described above into a cross-functional data-flow model such as the following:



Figure 6

The business activity and business information are then “owned” by the “swim-lane” in which they reside. From this, we can then extract the business entities and group them
together to form the business concept map described in the previous article.

 

Figure 7

Note: As mentioned previously, this isn’t a fully detailed model because it doesn’t need to be at this stage. There are, for example, relationships between the business entities that
haven’t yet been described (the three shown are just examples) and some of those business entities will have complex internal structure (e.g., customer order will probably be composed of order
header and multiple order lines). This detail can be added separately.

That is essentially our domain decomposition showing which functional business area owns what information. Of course, we’ll have a multitude of business activities to analyse, but that’s
just a repeated application of the same rules to generate more detail.


Establishing the “Influencers” of an Information Flow
The “influencers” of a business activity are the functional business areas that will be affected if the business activity changes and so would need to be “consulted” when any
changes are proposed to ensure that they can comply with the change.

Having established owners of an information flow, establishing the influencers is really straightforward. Influencers are simply the functional business areas that either provide input data into a
business activity or are the consumers of the information produced by a business activity or provide “reference data” used to validate the correctness of the business information that is
produced.

In the above cross-functional data-flow model, the influencer of the customer order is product management because that is the only functional business area that has any dependency, through the fulfil
order activity, on the definition or existence of a customer order.
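
The definition of an influencer translates fairly directly into code. The sketch below is illustrative only: the Activity class, the influencers function, the "take order" activity and the assignment of entities to sales and product management are assumptions, since the figure itself is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """A business activity placed in its owner's swim-lane."""
    name: str
    owner: str
    consumes: set = field(default_factory=set)
    produces: set = field(default_factory=set)
    reference: set = field(default_factory=set)


def influencers(activity, activities, entity_owner):
    """Return the areas, other than the owner, that an activity depends on
    or that depend on it."""
    areas = set()
    # Areas that provide input data or reference data to the activity.
    for entity in activity.consumes | activity.reference:
        areas.add(entity_owner[entity])
    # Areas whose activities consume what this activity produces.
    for other in activities:
        if other.consumes & activity.produces:
            areas.add(other.owner)
    areas.discard(activity.owner)
    return areas


# Illustrative data only; owners and activities are assumptions.
entity_owner = {
    "customer": "Sales",
    "customer order": "Sales",
    "product": "Product Management",
}
take_order = Activity("take order", "Sales", consumes={"customer"},
                      produces={"customer order"}, reference={"product"})
fulfil_order = Activity("fulfil order", "Product Management", consumes={"customer order"})
model = [take_order, fulfil_order]

take_order_influencers = influencers(take_order, model, entity_owner)
fulfil_order_influencers = influencers(fulfil_order, model, entity_owner)
```

Under these assumptions, product management is the only influencer of the activity that produces the customer order, which agrees with the example above.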


Identifying Coherence Boundaries
As previously mentioned, the coherence boundaries are the points at which different functional business areas have to communicate with the outside world in a consistent and grammatically
structured language.

From our data-flow model, a coherence boundary occurs at every point that an information flow crosses from one functional business area into another functional business area. At these points, it is
essential for the owner to:

  • Establish a public service that allows any consumers of the data to access it as and when required, and
  • Publish a stable definition of the interface using commonly agreed terms and definitions that is only subject to change when the business requirements are changed and not because of any
    internal reorganisation of the functional business areas.

Having “incoherent boundaries” between functional areas is where most of the risk of process failure will occur due to misunderstandings in interface specification or data
definition. This then results in poor data quality and data errors with, no doubt, a cascading impact on downstream activities.
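
Finding the coherence boundaries is then a matter of picking out the information flows whose two ends sit in different functional business areas. This is a minimal sketch, assuming each flow is recorded as an (entity, owning area, consuming area) triple; the coherence_boundaries function and the sample flows are hypothetical.

```python
from collections import defaultdict


def coherence_boundaries(flows):
    """Group the boundary-crossing flows by the owning area.

    flows: iterable of (entity, owning area, consuming area) triples.
    Returns owning area -> set of (entity, consuming area) pairs, i.e. the
    interfaces that each owner needs to publish and keep stable.
    """
    interfaces = defaultdict(set)
    for entity, owning_area, consuming_area in flows:
        if owning_area != consuming_area:
            interfaces[owning_area].add((entity, consuming_area))
    return dict(interfaces)


# Illustrative data only; the flows are assumptions.
flows = [
    ("customer order", "Sales", "Product Management"),
    ("customer", "Sales", "Sales"),  # internal flow, so not a boundary
    ("product", "Product Management", "Sales"),
]
boundaries = coherence_boundaries(flows)
```

Each entry in the result is a point at which the owner has to provide a public service and a stable interface definition; flows that stay inside one area do not appear.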


Conclusion
In this article, we have looked at the process and principles of producing a domain decomposition in order to sub-divide an enterprise into its functional business areas and tried to establish a
rules-driven approach to doing it.

I should come clean and say that this is presented in a highly simplified form. In reality, the domain decomposition is likely to be formed of more than one layer because of the recursive nature of
domains, and it will almost certainly evolve over time as previously undocumented business activities are discovered and need to be integrated into what’s already defined. But that’s
pretty much expected I hope.

In the next article of this series, we’ll look at the coherence boundaries in more detail because, believe it or not, there’s a lot of interesting architectural behaviour that takes place
at the boundaries that needs to be taken into consideration.


About Adrian Miley

Adrian Miley is a Director at Miley Watts & Associates Ltd, a UK Consultancy specialising in Distributed Data Architecture, and a Director at Taxosys Ltd, a publisher of Taxonomy Management software. He has 20+ years of experience across a wide range of business sectors in the architecture, design and build of large scale data processing environments with an emphasis on innovative solutions extracting the most benefit for the least amount of effort. He can be contacted by email at adrian.miley@mileywatts.com or via his LinkedIn profile at http://www.linkedin.com/in/adrianmiley.
