“Focus on everything, and you have not actually focused on anything.” Eliyahu M. Goldratt, in The Haystack Syndrome: Sifting Information Out of the Data Ocean
In TDAN 5.0, we focused on the “Data” component of a comprehensive IT Component-Framework Model. We
proposed the following definition for the architectural component-type “Data”: [Data is] a persistent set of propositions, resulting from observation, which are instantiated by the assignment
of values to labels. Then, concentrating on the interfaces (common boundaries) “exposed” by this component, we presented a diagram in Microsoft Component Object Model [COM] notation. Labels
were assigned to the interfaces which the Data component shares with the component-types Human, Software, Hardware and Data itself.
Enhancing the value of the data component is usually the goal of the Data Administration (DA) function within a given enterprise. DA focuses on the data component, specifically its values
(e.g., data quality), and its labels (i.e., “meta”-data). DA’s goals are challenging enough, but important issues persist above and beyond successful management of the data component
alone. Even in cases where DA makes important progress in enhancing the value of the data resources of an enterprise, these improvements are likely to be limited by constraints on the
interfaces of the data component. Confronting and dealing with these constraints is the responsibility of a function which we’ll designate “Data Interface Management”.
What techniques are available to help Data Interface Management perform this function? One technique is the “Theory of Constraints” (TOC), developed by Eliyahu M. Goldratt(1). TOC suggests that
in order to achieve the greatest improvements in a given system, we should focus on its weakest links, or constraints. Data Interface Management can put the IT architecture framework to use by
focusing on the constraints on each data interface, using these TOC “focusing steps”:
- Identify the constraints: So far we have explicitly labeled and described each interface, which is a prerequisite to identifying the constraints on the interfaces. Then, performance
measurements should be identified for each interface. “Not just any measurements, but measurements that will enable us to judge the impact of a local decision on the global goal.”(2)
- Exploit the constraints: Exploit means to get the most out of existing resources. This is the domain of the “quick fix”. In general, this means optimizing the performance of
available technologies, to the extent feasible within the current fiscal period’s constraints. Optimization is measured by increases in the values of the relevant measurement units.
- Elevate the constraints: Elevate means to apply additional resources to the constraint. This requires investigating, planning for, and acquiring new and expanded techniques and
technologies for optimizing interfaces to the greatest extent possible within the next fiscal period’s constraints and beyond. Sustainable, ongoing optimization is made possible through
the implementation of durable and sustainable technical and technological standards.
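The identification step can be sketched in Python. This is only an illustration under stated assumptions, not a procedure taken from Goldratt or from the framework: the interface labels are the ones used in the examples that follow, while the `InterfaceMeasurement` structure, the sample figures, and the rule of treating the interface furthest below its target as the constraint are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InterfaceMeasurement:
    """One performance reading for a data interface (hypothetical structure)."""
    interface: str   # interface label, e.g. "iHarDat"
    actual: float    # measured value, in the interface's own unit
    target: float    # value assumed necessary to support the global goal


def score(m: InterfaceMeasurement) -> float:
    """Normalize a reading so interfaces with different units can be compared."""
    return m.actual / m.target


def identify_constraint(readings: list[InterfaceMeasurement]) -> InterfaceMeasurement:
    """Return the weakest link: the interface furthest below its target."""
    return min(readings, key=score)


# Illustrative figures only.
readings = [
    InterfaceMeasurement("iHarDat", actual=80.0, target=100.0),
    InterfaceMeasurement("iHumDat", actual=3.0, target=10.0),
    InterfaceMeasurement("iSofDat", actual=45.0, target=50.0),
]
constraint = identify_constraint(readings)
print(constraint.interface)
```

Normalizing against a target is one way to compare interfaces measured in different units. An enterprise would substitute whatever measurements let it judge local decisions against its own global goal.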
Examples of Data Interface Management’s use of the TOC focusing steps include, but are not limited to, the following.
The Data-Hardware Interface (iHarDat)
Interface Constraint Identification
- The performance of data depends on hardware: the installed and available data storage and data transmission technologies.
- Examples of interface measurement units include bandwidth, seek time, transfer rate, and storage density.
Interface Constraint Exploitation
- Knowledge transfer, cross-training and communication between the Storage Technology staff and Data Interface Management are initiated.
- Storage Technology staff and Data Interface Management work together to optimize available data-hardware technologies, to the extent feasible within the current fiscal period’s constraints.
Interface Constraint Elevation
- Storage Technology and Data Interface Management staffs work together to investigate, plan, acquire and expand data-hardware technologies to optimize this interface to the extent possible
within the next fiscal period’s constraints, and beyond.
- Durable and sustainable standards for data-hardware interface technologies are investigated, and implementation is begun.
The Data-Human Interface (iHumDat)
Interface Constraint Identification
- The performance of data depends on humans, e.g., those by whom it is created and interpreted.
- The primary interface measurement unit is the data inventory turnover ratio, a measurement of the amount of informing that happens within a specified time period, compared to the gross amount
of data in storage. This will be discussed in more detail in future installments.
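As a rough illustration of the ratio just described, the following Python sketch divides the amount of data that informed someone during a period by the gross amount in storage. The definition above does not say how “informing” is to be measured, so counting it in bytes actually consulted by users is an assumption made here, as are the function name and the figures.

```python
def data_inventory_turnover(informed_bytes: float, stored_bytes: float) -> float:
    """Ratio of data consulted in the period to gross data in storage.

    Measuring "informing" in bytes consulted is an assumption of this sketch.
    """
    if stored_bytes <= 0:
        raise ValueError("stored_bytes must be positive")
    return informed_bytes / stored_bytes


# Illustrative figures only: 120 GB consulted in the period, 2,000 GB stored.
ratio = data_inventory_turnover(informed_bytes=120e9, stored_bytes=2000e9)
print(f"{ratio:.2f}")
```

A low ratio would suggest that most stored data is not informing anyone during the period, which is the kind of signal the data-human interface measurement is meant to expose.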
Interface Constraint Exploitation
- Data Interface Management works to increase the level of knowledge transfer, cross-training and communication between data providers and data consumers (i.e., business users), and between these
groups and Data Interface Management.
- A durable and effective data-stewardship program is initiated, with appropriate links to job performance measurement.
- Data Interface Management, working in conjunction with data providers and data consumers in business areas, optimizes available data-capture and interpretation technologies, to the extent
feasible within the current fiscal period’s constraints.
Interface Constraint Elevation
- Durable and sustainable standards for data-capture and data-presentation technologies are investigated and planned, and implementation is begun.
- Data Interface Management, in conjunction with data providers and consumers, investigates, plans for, and directs the acquisition of innovative data-capture and data-presentation
technologies, to optimize this interface to the extent possible within the next fiscal period’s constraints and beyond.
The Data-Data Interface (iDatDat)
Interface Constraint Identification
- The performance of data depends on other data at the same level of abstraction.
- Example interface measurement units include counts of identified, implemented, authentic functional dependencies and foreign-key references across the data resource.
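One part of this measurement, the count of implemented foreign-key references, can be read directly from a database catalog. The sketch below uses Python’s standard `sqlite3` module and SQLite’s `PRAGMA foreign_key_list`. The three-table schema is hypothetical, and the count covers only references declared in the schema; it says nothing about whether they are authentic, or about dependencies that have been identified but not implemented.

```python
import sqlite3

# Hypothetical schema, built in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    CREATE TABLE payment (
        id INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES account(id),
        customer_id INTEGER REFERENCES customer(id)
    );
""")


def count_foreign_keys(conn: sqlite3.Connection) -> int:
    """Count declared foreign-key columns across all tables in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    total = 0
    for table in tables:
        # The pragma returns one row per referencing column of each foreign key.
        # Table names come from the catalog itself and are simple identifiers here.
        rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        total += len(rows)
    return total


fk_count = count_foreign_keys(conn)
print(fk_count)
```

Other database systems expose the same information through their own catalogs, so the query would differ while the measurement stays the same.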
Interface Constraint Exploitation
- Data Interface Management facilitates communication between data providers, data consumers, and Data Administration, with the goal of increasing the number of identified, implemented, authentic
functional dependencies and foreign-key references across the data resource.
Interface Constraint Elevation
- Durable and sustainable standards for identification and implementation of functional dependencies and foreign-key references across the data resource are investigated and planned, and
implementation is begun.
- Data Interface Management works with data providers, data consumers, and Data Administration with the goal of identifying new functional dependencies and foreign-key references with
demonstrable business value across the data resource. Typical examples are customer-record matching and householding.
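Householding can be illustrated with a deliberately simple Python sketch that groups customer records sharing a normalized address. The record layout, the sample names, and the crude normalization rule are assumptions made for the example. Production matching typically relies on far more robust address standardization and fuzzy comparison.

```python
from collections import defaultdict


def normalize_address(address: str) -> str:
    """Crude normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(ch for ch in address.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def household(records: list[dict]) -> dict[str, list[str]]:
    """Group customer names by normalized address (one group per household)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for record in records:
        groups[normalize_address(record["address"])].append(record["name"])
    return dict(groups)


# Hypothetical customer records.
records = [
    {"name": "A. Smith", "address": "12 Oak St."},
    {"name": "B. Smith", "address": "12  oak st"},
    {"name": "C. Jones", "address": "7 Elm Ave"},
]
households = household(records)
for address, names in households.items():
    print(address, names)
```

Each group found this way is a newly identified relationship between records that the data resource did not previously express.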
The Software-Data Interface (iSofDat)
Interface Constraint Identification
- Given the current state of information technology, the performance of data depends on software.
- Software-data interface performance measurement will quantify the value added by a software component to the data it accepts as input, and/or provides as output. Input value enhancement
includes integrity and domain validation, for example. Output value enhancement includes deriving additional data, drawing conclusions, and determining inferences, patterns, or relationships to
other data.
Interface Constraint Exploitation
- Implement automated data quality and integrity monitoring to the maximum extent possible.
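A minimal Python sketch of such monitoring follows, assuming a small set of rule functions applied to every record in a batch. The rule names, the field names, and the permitted domains are all hypothetical; they stand in for whatever integrity and domain rules an enterprise actually defines.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# Hypothetical integrity and domain-validation rules.
RULES: dict[str, Rule] = {
    "id_present": lambda r: r.get("id") is not None,
    "age_in_domain": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 130,
    "country_in_domain": lambda r: r.get("country") in {"US", "CA", "MX"},
}


def monitor(records: list[dict]) -> dict[str, int]:
    """Count how many records violate each rule in a batch."""
    failures = {name: 0 for name in RULES}
    for record in records:
        for name, rule in RULES.items():
            if not rule(record):
                failures[name] += 1
    return failures


# Illustrative batch only.
batch = [
    {"id": 1, "age": 34, "country": "US"},
    {"id": 2, "age": -5, "country": "CA"},
    {"id": None, "age": 51, "country": "FR"},
]
report = monitor(batch)
print(report)
```

Run on a schedule against incoming data, violation counts like these give the software-data interface a measurement that can be tracked from one period to the next.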
Interface Constraint Elevation
- A powerful example of elevating the software-data output interface is the implementation of data mining. Data mining technologies seek to extract the maximum value from available data.
Data Refinement (iRefine)
Interface Constraint Identification
- The performance of Data depends on (meta)data, i.e., data at a higher level of abstraction, including labels.
- Example interface measurement units include counts of authentic, user-accessible linkages between data and meta data.
Interface Constraint Exploitation
- Data Interface Management assists Data Administration in identifying existing meta data and its inferred or explicit relationships to physical data, through both forward and reverse data
engineering.
Interface Constraint Elevation
- Implement a meta data repository with links to physical data.
- Standardize meta data formats as well as forward and reverse data engineering techniques and technologies.
Current State of Data Determines Future State of Data (iDeterm)
Interface Constraint Identification
- The performance of future data depends on the state of current and past data.
- Interface measurement will quantify, for example, the effectiveness with which historical data is retained relative to demands for accessibility, as well as the growth rate of the total data
resource of the enterprise.
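The growth-rate measurement can be sketched in Python as a compound rate per period. No formula is prescribed above, so the compound form, the function name, and the figures are assumptions for illustration.

```python
def compound_growth_rate(start_size: float, end_size: float, periods: int) -> float:
    """Average per-period growth rate that takes start_size to end_size."""
    if start_size <= 0 or periods <= 0:
        raise ValueError("start_size and periods must be positive")
    return (end_size / start_size) ** (1 / periods) - 1


# Illustrative figures only: 10 TB grows to 14.4 TB over two fiscal periods.
rate = compound_growth_rate(start_size=10.0, end_size=14.4, periods=2)
print(f"{rate:.1%}")
```

Tracking this rate alongside retention and accessibility measurements helps show whether storage and archival capacity are keeping pace with the data resource.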
Interface Constraint Exploitation
- Utilize available data staging and hierarchical storage management systems to the greatest extent possible.
- Begin plans for enhanced data acquisition in support of near-term business opportunities.
Interface Constraint Elevation
- Implement enhanced data acquisition in support of near-term business opportunities.
- Begin plans for enhanced data acquisition in support of mid- and long-term business opportunities.
- Investigate and implement advanced data retention and archival technologies.
- Investigate and implement standards and utilities for data migration, with the objective of maximizing the independence of the data resource from its storage mechanisms.
In summary, much of this may sound very familiar. Data Interface Management encompasses most, if not all, of the current techniques and technologies for data warehousing. The current
success enjoyed by data warehousing can be attributed to its focus on elevating the constraints on data interfaces, the third of the TOC focusing steps listed above. Significant opportunities
for identifying and exploiting constraints on data interfaces remain to be explored. Within the context of the IT Component-Framework Architecture, data warehousing can be
understood as a subsidiary discipline of Data Interface Management.
(1) TOC is the subject of all the books of Eliyahu M. Goldratt, including The Goal, The Race, Theory of Constraints, Critical Chain, and The Haystack Syndrome.
(2) Eliyahu M. Goldratt, The Haystack Syndrome: Sifting Information Out of the Data Ocean, North River Press, 1990.