Practical Data Stewardship Structures

Published in TDAN.com January 2006


Introduction

Recent studies have indicated that bad data costs money: it results in poor, uninformed decision-making and, eventually, missed business opportunities. In recent times,
companies have engaged resources to clean up their customer data to enable CRM-related initiatives, but the focus has now shifted to data in other areas of the business, such as supply chain and
finance, and to tackling what can seem like unyielding data quality problems in nearly every business domain. This article looks at the Data Stewardship Program and provides an overview
of how it could be implemented in an organization.


What is Data Stewardship?

Data Stewardship is a program that essentially involves going beyond being efficient to being effective. It increases operational effectiveness by aligning the organization with its strategic goals
and enabling information to be a strategic asset. The result is better decision making, because the data is more accurate, pertinent and timely, and lower operational costs, because resources
become more scalable and re-usable.


Program Implementation Approach

As with any other program, a Data Stewardship Program requires careful and meticulous planning in order to be successful and effective. There are multiple approaches to implementing one.
A brief overview of two of them is presented below:

(1) Dedicated Team (All full-time resources)

  • Pros
      • Since all the resources are full-time, the team can focus solely on measuring and improving data quality.
  • Cons
      • Requires more investment from the organization.

(2) Virtual Team (All part-time resources)

  • Pros
      • A more practical approach for organizations that are just beginning to invest in a data quality program.
  • Cons
      • Since the team is not full-time, day-to-day issues not related to data quality may distract the team from focusing on solving pure data quality issues.

Each approach needs to be weighed and evaluated in light of the specific needs of the organization. Most companies adopt a hybrid approach in which the team consists of both part-time and full-time
resources. Irrespective of the approach, the data stewardship team would consist of the following key roles as shown in the diagram below:

Chief Data Steward

The Chief Data Steward leads the Data Stewardship program and possesses a unique combination of business, technology and diplomatic skills. This person would be politically astute and excellent at
conflict resolution. When there are multiple approaches and the Subject Area Data Stewards, Business Analysts, Information Architects and Subject Matter Experts cannot agree on one, this person
will step in to facilitate the best approach in the interests of the larger organization. This person is responsible for guiding the data quality journey and does not treat it as a destination or
a project. It is not a job for someone steeped only in technical knowledge, nor for a business person who is unfamiliar with technology. This individual is responsible for providing the overall
data quality management (DQM) strategy and vision and comes up with the governance charter for implementing data quality. He/she will make sure that the roles and responsibilities for the data
quality (DQ) team are defined appropriately. This individual will evangelize the adoption of DQM measures internally and externally with partners, will oversee the Net Present Value (NPV) analysis
for the various data quality initiatives, and will seek appropriate funding from the business sponsor of this program.
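
To make the NPV analysis mentioned above concrete, here is a minimal sketch in Python. The discount rate, the cash flows and the function name net_present_value are illustrative assumptions and do not come from any particular organization or tool; the formula itself is the standard one, in which each yearly cash flow is discounted back to the present and the results are summed.

```python
def net_present_value(rate, cash_flows):
    """Discount a series of yearly cash flows back to today and sum them.

    cash_flows[0] is the amount at year 0 (usually a negative up-front cost);
    cash_flows[1] is the net benefit at the end of year 1, and so on.
    """
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


# Hypothetical cleansing initiative: 100,000 spent up front,
# followed by 60,000 in savings in each of the next two years.
flows = [-100_000, 60_000, 60_000]
npv = net_present_value(0.10, flows)  # assumed 10% discount rate
print(round(npv, 2))  # 4132.23
```

A positive result suggests that, under these assumed figures, the initiative pays back more than it costs once the time value of money is taken into account. Comparing the NPV of several candidate initiatives is one way to decide which of them to take to the business sponsor first.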

Lead Steward – Business Analysis

The focus of this lead is to guide the individual subject area data stewards in getting all the business rules documented and implemented with the help of the Subject Matter Experts, and to make
proposals that define how the company uses information in a consistent manner. This person would have a thorough knowledge of the business, as they would need to make frequent
judgment calls. This includes the ability to determine when 100% perfection is not needed. The Lead Steward is also responsible for analyzing the costs of fixing or cleansing the data versus the
payback. The Lead Steward – Business Analysis guides the individual data stewards to focus on content rather than on the data structure. Although they don’t own the content, they facilitate the
process of sending defects back to the owner of the data. Knowledge of the data structure is still important, as it helps shape the data strategy for the entire organization. He/she will also
initiate meetings with the data owners to review progress towards pre-defined data quality goals. They drive the work of assessing the business impact of data quality problems and work with the
individual data stewards to develop and recommend solutions for solving them proactively.

Subject Area Data Steward

This individual would have responsibility for one or more subject areas. For companies that are newly embarking on this program, it is not uncommon for the DQ team to start off with one or two
subject area data stewards led by a Lead Steward from the Business Analysis team. They will perform data profiling using data discovery, structure discovery and relationship discovery techniques
and also work with the Engineering ETL teams to implement the business rules. They will ensure that data definitions adhere to appropriate enterprise and industry standards.
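
The profiling techniques named above can be illustrated with a small sketch. It is written in plain Python and assumes the data has already been loaded as a list of dictionaries; the function names (profile_column, orphan_keys) and the sample customers and orders are hypothetical and stand in for whatever profiling tool the team actually uses. The first function covers simple data and structure discovery for one column, and the second is a basic relationship check between two tables.

```python
import re
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, value shapes."""
    values = [row.get(column) for row in rows]
    populated = [v for v in values if v not in (None, "")]
    # Structure discovery: reduce each value to a shape, e.g. "94105" -> "99999".
    shapes = Counter(
        re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", str(v))) for v in populated
    )
    return {
        "rows": len(values),
        "missing": len(values) - len(populated),
        "distinct": len(set(populated)),
        "patterns": dict(shapes),
    }


def orphan_keys(child_rows, child_key, parent_rows, parent_key):
    """Relationship discovery: child key values with no matching parent row."""
    parent_values = {row[parent_key] for row in parent_rows}
    return sorted({row[child_key] for row in child_rows} - parent_values)


# Hypothetical sample data for illustration only.
customers = [
    {"id": "C1", "zip": "94105"},
    {"id": "C2", "zip": "9410"},
    {"id": "C3", "zip": ""},
]
orders = [
    {"order_id": "O1", "customer_id": "C1"},
    {"order_id": "O2", "customer_id": "C9"},
]

zip_profile = profile_column(customers, "zip")
orphans = orphan_keys(orders, "customer_id", customers, "id")
print(zip_profile)
print(orphans)
```

In this sample, the profile shows one missing zip code and two different value shapes, and the relationship check reports one order that points at a customer that does not exist. Findings like these are the raw material for the business rules that the steward then works out with the Subject Matter Experts.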

Data Quality Auditor

This individual is responsible for devising and continuously refining the metrics used for data quality and for measuring the success of the data stewardship program. They will monitor ETL
processes and report data quality problems using a combination of tools and manual sampling techniques. They will work with the Lead Steward – Business Analysis to co-ordinate data quality fixes.
If needed, this person will work with the business units to have new extracts re-processed from the appropriate sources. This individual is responsible for conducting periodic data quality
assessment and maintenance procedures and for raising awareness of data quality risks. For bigger organizations, this could be a small team led by a Lead Data Quality Auditor.
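
As an illustration of the kind of metrics an auditor might devise, the sketch below computes two commonly used measures, completeness and validity, and compares them against target thresholds. Everything specific here is an assumption made for the example: the function names, the sample claims records, the rule that an amount must be a non-negative number, and the threshold values.

```python
def completeness(rows, column):
    """Share of rows in which the column is populated."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return filled / len(rows)


def validity(rows, column, is_valid):
    """Share of populated values that satisfy a business rule."""
    populated = [row[column] for row in rows if row.get(column) not in (None, "")]
    if not populated:
        return 1.0
    return sum(1 for value in populated if is_valid(value)) / len(populated)


def non_negative_amount(value):
    """Assumed business rule: an amount must parse as a number and be >= 0."""
    try:
        return float(value) >= 0
    except ValueError:
        return False


def audit(rows, checks):
    """Run each (name, score_function, threshold) check and record pass or fail."""
    report = []
    for name, score_fn, threshold in checks:
        score = score_fn(rows)
        report.append({"metric": name, "score": score, "passed": score >= threshold})
    return report


# Hypothetical extract for illustration only.
claims = [
    {"claim_id": "1", "amount": "120.50"},
    {"claim_id": "2", "amount": "-5"},
    {"claim_id": "3", "amount": ""},
    {"claim_id": "4", "amount": "80"},
]
checks = [
    ("amount completeness", lambda r: completeness(r, "amount"), 0.95),
    ("amount validity", lambda r: validity(r, "amount", non_negative_amount), 0.60),
]
report = audit(claims, checks)
for line in report:
    print(line)
```

With this sample, completeness is 0.75 and misses its assumed 0.95 target, while validity is two out of three and clears its assumed 0.60 target. Tracking such scores for each extract over time gives the auditor a simple way to report whether the program is improving the data.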

Subject Matter Expert

The SME is a domain expert for a certain business area. For example, in the healthcare industry, a pharmacist who understands the data entities used would qualify to play this role. They understand
the data inside and out and provide high-level business rules to the Subject Area Data Stewards for further refinement and implementation.

Information Architect

The Information Architect is responsible for driving the technical evaluation and implementation of tools and procedures to be used for data modeling and capturing meta-data for various subject
areas within the organization. They will lay down the architectural framework for data centralization, consistency and control.

Technical Team Lead – ETL and Tools

This person is responsible for managing ETL developers that write the code for implementing the business rules and also tools developers that write the tools used by the Data Quality Auditor for
data quality monitoring and measurement.


Conclusion

In most organizations, the above roles are combined, but as the data stewardship program gets bigger in scope, it might be beneficial to have these roles played by separate individuals. A few
enterprises recognize the importance of data quality, but they typically address the problem tactically. Only companies that view data quality as a strategic business issue, and that approach it
accordingly by implementing the right Data Stewardship programs, are likely to achieve improvement.

About Diby Malakar

Diby works as a Principal Product Manager for Informatica, the data integration company. His prior experience includes working as Acting-VP of Engineering and Senior Director of Product Management for Cloud9 Analytics, a SaaS-based startup focused on building sales performance management applications. He has also worked for companies like KPMG, Neoforma and TiVo in various engineering management and product management capacities. He has more than 14 years of experience in the information technology industry and specializes in business intelligence, data warehousing and data quality management. He holds a Bachelor's degree in Computer Science and an MBA in Information Systems.

He is an active member of IAIDQ and the Program Director for the San Francisco chapter of DAMA. His articles have been published in a variety of places such as TDAN, Wharton’s Leadership Digest and Oracle Magazine.
