Data Quality and the Project Life Cycle

There are many different approaches to running a project. A project life cycle (PLC) defines the approach a project takes to developing solutions. Examples of solutions include building a new
application, migrating data from existing applications to new ones, or improving a process. The PLC provides the basis for the project plan and the tasks to be undertaken by the project team.
The PLC may also be referred to as the SDLC – Solution (or Software or Systems) Development Life Cycle.

There are different ways of structuring a PLC, and many companies and vendors have their own approaches. Figure 1 shows the phases in a typical project life cycle alongside the corresponding
phases in five other PLC approaches.


Figure 1: Project Life Cycle Comparison

No matter which project approach you use, data quality tasks can (and should) be integrated into the project plan. Careful planning at the beginning of a project helps ensure that appropriate
data quality activities are integrated throughout the entire project.

Data quality issues discovered early in the project life cycle are much less expensive to correct than those found during final testing or just before go-live. Including data
quality-related tasks during the normal course of a project prevents many data quality problems from surfacing in production. The quality of data can make the difference between a smooth
transition, with business continuing as usual, and a rocky conversion that prevents even basic business activities (e.g., completing a timely financial close vs. a late
financial close, or meeting manufacturing and shipping commitments vs. resorting to expensive workarounds while issues are resolved).

Following is a brief list of data quality activities that should be considered in each of the phases of the typical PLC:

Justification
Include the impact of data quality issues during the project presentation of the business problem (or opportunity) and the proposed solution.
 
Planning
Consider data quality activities when setting the project scope, schedule and deliverables. Include data quality deliverables in the project charter. Ensure the project plan includes time and
resources for the data quality activities. Remember to plan for data quality control throughout the project for:

  • Data to be created
  • Data to be moved from existing databases and applications
  • Configuration, reference and setup data

Requirements and Analysis
Conduct data profiling and analysis that will expedite source-to-target mappings and confirm selection criteria for data to be migrated. Analyze and understand source and target databases. Define
data specifications and business rules for:

  • Data to be created
  • Data to be moved from existing databases and applications
  • Configuration, reference and setup data
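The profiling described above can start as a simple column-level summary of fill rates and value distributions, which informs both the source-to-target mappings and the selection criteria. Below is a minimal sketch in Python using only the standard library; the record layout and field names are hypothetical, not from the article.

```python
from collections import Counter

def profile_column(rows, column):
    """Summarize fill rate, distinct count, and top values for one source column."""
    values = [r[column] for r in rows]
    non_blank = [v for v in values if v.strip()]
    return {
        "column": column,
        "rows": len(values),
        "fill_rate": len(non_blank) / len(values) if values else 0.0,
        "distinct": len(set(non_blank)),
        "top_values": Counter(non_blank).most_common(3),  # most frequent values
    }

# Hypothetical extract from a source system
rows = [
    {"customer_id": "C1", "country": "US"},
    {"customer_id": "C2", "country": ""},
    {"customer_id": "C3", "country": "US"},
]
print(profile_column(rows, "country"))
```

A low fill rate or an unexpected set of top values in a profile like this is exactly the kind of finding that feeds back into the data specifications and business rules.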

Design
Use data quality profiling tools to create or validate an information model. Ensure those analyzing the data have access to the data specifications and business rules that define the expected
quality. Ensure a solid feedback loop is in place between those doing the analysis, those cleansing the data, and those writing the transformation rules. Institute data cleanup (preferably at
the sources) as early in the project as possible. If cleanup is not possible or practical at the source, determine where in the migration path (e.g., in a staging area) appropriate cleanup
activities can take place.
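As one illustration of cleanup applied in a staging area (assuming source-side cleanup is not practical), the sketch below trims whitespace and standardizes a hypothetical country field before transformation. The field names and mapping values are illustrative assumptions, not rules from the article.

```python
# Hypothetical standardization mapping, applied in a staging area
# before the transformation rules run; values are illustrative only.
COUNTRY_MAP = {"USA": "US", "U.S.": "US", "United States": "US"}

def cleanse_record(record):
    """Apply cleanup rules: trim whitespace and standardize country codes."""
    cleaned = {k: v.strip() for k, v in record.items()}
    cleaned["country"] = COUNTRY_MAP.get(cleaned["country"], cleaned["country"])
    return cleaned

staged = [{"name": " Acme Corp ", "country": "U.S."}]
print([cleanse_record(r) for r in staged])
# → [{'name': 'Acme Corp', 'country': 'US'}]
```

Keeping rules like this in one place, visible to both the analysts and the transformation developers, supports the feedback loop described above.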

Development and Testing
Continue the iterative process of assessing data and providing results to those cleansing data and those writing transformation rules. Remember to
check newly created data in addition to existing data being migrated. Check and ensure the quality of configuration, reference and setup data. Profile and check data both before and after the
test loads. Update data specifications, business rules and transformation rules as necessary. Profile the data used for software quality assurance testing so its content is understood and
efforts can focus on the software functionality.
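Checking data before and after a test load can be as simple as comparing record counts and control totals between the source extract and the target. A minimal sketch under assumed field names:

```python
def control_totals(records, amount_field):
    """Compute simple reconciliation measures for a data set."""
    return {
        "row_count": len(records),
        "amount_sum": round(sum(float(r[amount_field]) for r in records), 2),
    }

# Hypothetical source extract and post-load target rows
source = [{"amount": "10.50"}, {"amount": "4.25"}]
target = [{"amount": "10.50"}, {"amount": "4.25"}]

pre = control_totals(source, "amount")
post = control_totals(target, "amount")
assert pre == post, f"Load discrepancy: {pre} vs {post}"
print("Test load reconciled:", post)
```

A mismatch here points to dropped records or transformation defects worth investigating before the next test cycle.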

Deployment
Use quick data quality assessments to confirm data extracts are correct prior to the final data loads at go-live. Conduct quick data quality assessments after the
data loads, before the system is released to users.
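A quick post-load assessment might verify the expected row count and key integrity before releasing the system to users. This is a sketch under assumptions; the field names and checks are illustrative.

```python
def quick_assessment(loaded_rows, expected_count, key_field):
    """Run fast sanity checks on loaded data after a go-live load."""
    issues = []
    if len(loaded_rows) != expected_count:
        issues.append(f"row count {len(loaded_rows)} != expected {expected_count}")
    keys = [r[key_field] for r in loaded_rows]
    if len(keys) != len(set(keys)):
        issues.append(f"duplicate values in {key_field}")
    if any(not k for k in keys):
        issues.append(f"blank values in {key_field}")
    return issues  # empty list means the checks passed

rows = [{"customer_id": "C1"}, {"customer_id": "C2"}, {"customer_id": "C2"}]
print(quick_assessment(rows, 3, "customer_id"))
# → ['duplicate values in customer_id']
```

Checks this fast can run in the narrow go-live window without delaying the release decision.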

Post Production Support
Institute appropriate ongoing monitoring and metrics to check data quality and provide the ability to take quick action if needed.
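Ongoing monitoring can be as simple as tracking each agreed data quality metric against a threshold and flagging breaches for quick action. The metric and threshold below are illustrative assumptions:

```python
def check_metric(name, value, threshold):
    """Compare a data quality metric to its threshold and flag breaches."""
    status = "OK" if value >= threshold else "ACTION NEEDED"
    return f"{name}: {value:.1%} (threshold {threshold:.0%}) -> {status}"

# e.g., completeness of a required field, measured on a schedule
print(check_metric("customer email completeness", 0.93, 0.95))
# → customer email completeness: 93.0% (threshold 95%) -> ACTION NEEDED
```

The thresholds themselves should come from the business rules defined earlier in the project, so production monitoring measures the same expectations the project delivered against.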


Including these activities helps keep a project on schedule by preventing data quality surprises later in the project, when they are much more costly to address (from both a resource and time
perspective). Use these suggestions to stimulate your thinking about data quality and to determine how you can design good data quality into your project and the resulting applications and processes.

References:

IBM Rational Unified Process. Referenced May 31, 2007.
http://en.wikipedia.org/wiki/Rational_Unified_Process.

Rational Unified Process. Referenced May 31, 2007.
http://www-306.ibm.com/software/awdtools/rup/.

SAP Best Practices for Mining. Referenced May 31, 2007.
http://www.sap.com/usa/industries/mining/pdf/SAPBestPracticesforMining.pdf.

Siebel eRoadmap Implementation Methodology. Referenced May 31, 2007.
http://www.peoplesoftcity.com/siebel78/books/TestGuide/TestGuide_Overview6.html.

Using Oracle Tutor with AIM 3.0 and the Oracle Business Models, an Oracle White Paper, June 1999. Referenced June 1, 2007.
http://www.oracle.com/applications/human_resources/tutor-aim.pdf.


About Danette McGilvray

Danette McGilvray is President and Principal of Granite Falls Consulting, Inc., a firm that helps organizations increase their success by addressing the information quality and data governance aspects of their business efforts. She considers herself a second-generation pioneer – someone who benefits from those before her (she has learned from and worked with many people, including the three pioneers of data and information quality in the US: Larry English, Tom Redman, and Richard Wang) – and is an early innovator herself in data quality management. She continues to learn from her colleagues, clients, business partners, and everyone around her. You can reach her through email: danette@gfalls.com, LinkedIn: Danette McGilvray, or Twitter: @Danette_McG.

Danette is teaching a one-day tutorial, "Increasing Project Success Through Data Quality and Governance," at Enterprise Dataversity in Chicago, Illinois, Nov. 5, 2015. For more information see: http://edv2015.dataversity.net.
