A Practical Tool for Data Stewards

Data stewardship is an important role in today’s data-driven business organizations. Data stewards facilitate consensus about data definitions, quality, and usage. They guide activities to complete metadata, improve data quality, and ensure regulatory compliance. Stewards are also responsible for making recommendations about data access, security, distribution, retention, archiving, and disposal.

Unfortunately, typical data stewardship practices often don’t measure up to the importance of the role. All too frequently, data stewards are identified and assigned responsibility without the time and training to do the job well. When we designate busy people as data stewards without making time for them to do stewardship work, we should not expect high-impact results. Nor should we expect success without training stewards about roles, relationships, and accountabilities related to data.

Along with time and training, data stewards need tools that help them to do their work. This article offers a simple tool to help diagnose data problems and find the path from symptoms to causes, and from causes to solutions. The tables below identify common symptoms of data challenges that data stewards frequently encounter, grouped by ten core data management processes – naming data, defining data, designing data, managing quality, integrating data, accessing data, managing metadata, administering databases, managing systems, and governing data. Common causes of and solutions to data problems are identified for each process.

To use the tool, begin by browsing the index of common symptoms to find those related to your data management issues. Then use the associated numbers to find each symptom in the process tables. Note that a single symptom is often listed in several of the process tables. Explore the processes, causes, and solutions to develop problem-solving ideas and plans.
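For stewards who want to keep the tool in a script or notebook, the lookup described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `PROCESS_STARTS`, `SYMPTOM_INDEX`, `process_for`, and `processes_for_symptom` are hypothetical, the number ranges are read off the process tables in this article, and the index holds just three sample entries rather than the full list.

```python
from bisect import bisect_right

# First symptom number of each process table, in the order the tables appear.
# (Ranges taken from the numbering used in this article's tables.)
PROCESS_STARTS = [
    (1, "Naming Data"),
    (10, "Defining Data"),
    (19, "Designing Data"),
    (32, "Managing Data Quality"),
    (43, "Integrating Data"),
    (51, "Accessing Data"),
    (62, "Managing Metadata"),
    (72, "Administering Databases"),
    (81, "Managing Systems"),
    (91, "Governing Data"),
]

# A small excerpt of the symptom index (hypothetical structure, sample entries only).
SYMPTOM_INDEX = {
    "spreadsheet proliferation": [22, 42, 50, 61],
    "lack of trust in data": [32],
    "complex system interfaces": [48, 81],
}


def process_for(number):
    """Return the process table that contains the given symptom number."""
    if not 1 <= number <= 100:
        raise ValueError("symptom numbers run from 1 to 100")
    starts = [start for start, _ in PROCESS_STARTS]
    # bisect_right finds how many tables start at or before this number.
    return PROCESS_STARTS[bisect_right(starts, number) - 1][1]


def processes_for_symptom(symptom):
    """Return the process tables to consult for a symptom, in index order."""
    return [process_for(n) for n in SYMPTOM_INDEX[symptom]]


if __name__ == "__main__":
    for name in processes_for_symptom("spreadsheet proliferation"):
        print(name)
```

Looking up "spreadsheet proliferation" this way points to four different process tables, which illustrates why a single symptom may call for exploring several sets of causes and solutions.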

Index: Common Symptoms of Data Problems

application integration difficulty 47
business rule violations in data 31, 40, 90
can’t access needed data 52
complex system interfaces 48, 81
conflicting documentation 64
confusing abbreviations 5
confusing documentation 67
corrupted data can’t be repaired 79
data consolidation difficulties 98
data not available when needed 58
data ownership conflicts 99
data privacy compromised 53, 55, 89, 93
data retention/disposal uncertainty 97
data security compromised 54, 88, 92
data-related compliance violations 94
difficult-to-use data 28, 37
disaster recovery uncertainties 95
enterprise reporting difficulty 46
excessive database downtime 76
failure to meet business needs 21
hard to find data definitions 14
hard to find documentation 65
hard to find needed data 51
hard-to-identify data 8
hard-to-navigate databases 29, 59
high level of data disparity 9, 17, 25, 43, 70, 85
high level of data redundancy 18, 27, 71, 84
inadequate metadata 87
inappropriate use of data 16
incomplete data 36
incomplete documentation 63
incorrect data 34
incorrect data definitions 11
incorrect data names 3
incorrect reporting 24, 38
inefficient business analysis 26, 39
insufficient data storage capacity 72
lack of data definitions 10
lack of trust in data 32
large change request backlog 20
limited data sharing 49
lost data can’t be recovered 80
meaningless data definitions 12
meaningless data names 1
missing documentation 62
misunderstood data 15, 69
multiple names & aliases 6
need for data standardization 100
needed access not authorized 56
needed features not implemented 78
non-unique data names 2
obsolete data definitions 13
obsolete permissions still active 57
outdated documentation 66
overlapping and conflicting data 44
poor application performance 83
poor data access performance 60
poor data quality 86, 91
poor database performance 30
poor query performance 74
poor structural integrity 19, 33
poor update performance 75
shadow databases 41
shadow systems & databases 23
spreadsheet proliferation 22, 42, 50, 61
structureless data names 4
territorialism inhibits data sharing 96
unanticipated growth problems 73
unnamed data components 7
unreliable database connections 77
untimely data 35
untraceable data 45, 68, 82

 

Naming Data

Symptoms
1 meaningless data names
2 non-unique data names
3 incorrect data names
4 structureless data names
5 confusing abbreviations
6 multiple names & aliases
7 unnamed data components
8 hard-to-identify data
9 high level of data disparity

Causes
– informal naming practices
– lack of naming standards
– standards

Solutions
– data naming taxonomy
– data naming vocabulary
– standard naming structure
– standard abbreviations list
– compliance incentives

 

Defining Data

Symptoms
10 lack of data definitions
11 incorrect data definitions
12 meaningless data definitions
13 obsolete data definitions
14 hard to find data definitions
15 misunderstood data
16 inappropriate use of data
17 high level of data disparity
18 high level of data redundancy

Causes
– lack of data definition standards
– poor data definition practices
– lack of business participation
– legacy databases
– disparate metadata

Solutions
– data definition standards
– data definition templates
– data definition wiki
– business/tech collaboration
– data definition review
– metadata repository
– definitions system-of-record

 

Designing Data

Symptoms
19 poor structural integrity
20 large change request backlog
21 failure to meet business needs
22 spreadsheet proliferation
23 shadow systems & databases
24 incorrect reporting
25 high level of data disparity
26 inefficient business analysis
27 high level of data redundancy
28 difficult-to-use data
29 hard-to-navigate databases
30 poor database performance
31 business rule violations in data

Causes
– poor modeling techniques
– wrong choice of model type
– poor business representation
– excessive detail
– insufficient detail
– process-oriented design
– application-oriented design

Solutions
– data model standards
– E-R model guidelines
– dimensional model guidelines
– normalization guidelines
– atomic data guidelines
– aggregate data guidelines
– subject-oriented design
– consumer-oriented design

 

Managing Data Quality

Symptoms
32 lack of trust in data
33 poor structural integrity
34 incorrect data
35 untimely data
36 incomplete data
37 difficult-to-use data
38 incorrect reporting
39 inefficient business analysis
40 business rule violations in data
41 shadow databases
42 spreadsheet proliferation

Causes
– poorly defined DQ rules
– missing DQ rules
– absence of quality measures
– absence of quality reporting
– lack of accountability
– incomplete/incorrect edits

Solutions
– DQ rules taxonomy
– defined DQ rules
– DQ metrics and measures
– published DQ reports
– regular DQ audits
– designated DQ accountability
– DQ tasks in project plans

 

Integrating Data

Symptoms
43 high level of data disparity
44 overlapping and conflicting data
45 untraceable data
46 enterprise reporting difficulty
47 application integration difficulty
48 complex system interfaces
49 limited data sharing
50 spreadsheet proliferation

Causes
– lack of integration architecture
– technology-driven integration
– inadequate data warehouse
– absence of data marts
– unmanaged master data
– poor integration practices
– missing/wrong data sources

Solutions
– sound integration architecture
– business-driven integration
– sound warehousing design
– targeted data marts
– master data management
– integration best practices
– defined data sourcing criteria

 

Accessing Data

Symptoms
51 hard to find needed data
52 can’t access needed data
53 data privacy compromised
54 data security compromised
55 data privacy compromised
56 needed access not authorized
57 obsolete permissions still active
58 data not available when needed
59 hard-to-navigate databases
60 poor data access performance
61 spreadsheet proliferation

Causes
– missing metadata
– inadequate data access tools
– insufficient indexing
– inadequate search capability
– lack of content management
– poor user interface
– excessive downtime
– database design not user-friendly
– ineffective performance tuning
– ineffective security processes

Solutions
– robust metadata
– user-friendly tools and interfaces
– indexing and searching
– data access portals
– service level agreements
– service level accountability
– published service level metrics
– security policies & procedures
– periodic security/privacy audits
– security/privacy accountability

 

Managing Metadata

Symptoms
62 missing documentation
63 incomplete documentation
64 conflicting documentation
65 hard to find documentation
66 outdated documentation
67 confusing documentation
68 untraceable data
69 misunderstood data
70 high level of data disparity
71 high level of data redundancy

Causes
– casual metadata management
– fragmented metadata tools
– lack of documentation standards
– lack of data modeling standards
– undocumented changes
– no documentation incentives
– no documentation reviews
– “rush to production” projects

Solutions
– metadata templates & guidelines
– project metadata standards
– maintenance metadata standards
– metadata registries & portals
– metadata system-of-record
– metadata accountability
– metadata tasks in project plans
– incentives and reviews

 

Administering Databases

Symptoms
72 insufficient data storage capacity
73 unanticipated growth problems
74 poor query performance
75 poor update performance
76 excessive database downtime
77 unreliable database connections
78 needed features not implemented
79 corrupted data can’t be repaired
80 lost data can’t be recovered

Causes
– ineffective storage management
– passive growth management
– ineffective performance tuning
– unscheduled maintenance
– inadequate database connectivity
– outdated DBMS versions
– insufficient backup & recovery

Solutions
– continuous capacity planning
– proactive growth management
– performance SLAs
– availability/uptime SLAs
– connection protocol standards
– connectivity SLAs
– routine DBMS upgrades
– backup & recovery practices

 

Managing Systems

Symptoms
81 complex system interfaces
82 untraceable data
83 poor application performance
84 high level of data redundancy
85 high level of data disparity
86 poor data quality
87 inadequate metadata
88 data security compromised
89 data privacy compromised
90 business rule violations in data

Causes
– lack of data sharing architecture
– lack of integration architecture
– poor application design
– “quick fix” maintenance
– “misfit” acquired systems
– inconsistent data formats
– limited reuse of data functions
– testing with production data

Solutions
– application architecture standards
– application design review
– maintenance & testing standards
– application acquisition guidelines
– data sharing incentives
– database wrappers
– SOA-based access & update
– reusable data quality rules
– managed test data

 

Governing Data

Symptoms
91 poor data quality
92 data security compromised
93 data privacy compromised
94 data-related compliance violations
95 disaster recovery uncertainties
96 territorialism inhibits data sharing
97 data retention/disposal uncertainty
98 data consolidation difficulties
99 data ownership conflicts
100 need for data standardization

Causes
– lack of data management goals
– unclear, uncertain, ambiguous, or misaligned RAA (responsibility, authority, & accountability)
– poor P&P (policies & procedures)
– understaffed data management
– underfunded data management

Solutions
– “data as an asset” culture
– clear data management goals
– quality RAA + P&P
– security RAA + P&P
– privacy RAA + P&P
– compliance RAA + P&P
– disaster recovery RAA + P&P
– designated data ownership

 

 


About Dave Wells

Dave Wells leads the Data Management Practice at Eckerson Group, a business intelligence and analytics research and consulting organization. Dave works at the intersection of information management and business management, where real value is derived from data assets. He is an industry analyst, consultant, and educator dedicated to building meaningful and enduring connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through consulting, speaking, teaching, and writing. He is a continuous learner – fascinated with understanding how we think – and a student and practitioner of systems thinking, critical thinking, design thinking, divergent thinking, and innovation. He can be reached at dwells@eckerson.com.
