As one of the core data management activities, data migration has been practiced ever since the invention of computers. However, it can be the most neglected task on IT managers’ lists of
things to do, resulting in poor-quality data in the target system. This observation is not new, and the problem persists throughout the industry. It is estimated that 84% of data migration projects
fail [1]. The impacts of a failed data migration project can be numerous, including:
- Breakdown of target systems
- Poor data quality in the target environment
- Loss of business opportunity
- Cost overruns, etc.
What is Data Migration?
The term “data migration” is used in several contexts for data movement activities. Let’s look at the definition of data migration:
achieve an automated migration, freeing up human resources from tedious tasks. It is required when organizations or individuals change computer systems or upgrade to new systems, or when systems
merge (such as when the organizations that use them undergo a merger/takeover). (Source: Wikipedia)
This article addresses large-scale data migration projects, in which data is moved from the source (old) system(s) to the target (new) system(s) on a one-time basis, usually as a result of
an application or technology upgrade initiative.
The business objective of a data migration project is to move the data set of interest from the source system to the target system, while improving data quality and
maintaining business continuity.
In this article we discuss proven strategies for executing large-scale data migration projects. The list has been compiled over years of work on numerous large-scale, critical data
migration projects. Each strategy can be customized to an individual organization’s needs.
Strategy 1: Invest in Profiling Source Data
to uncover undocumented data relationships, data quality, data volume, data anomalies, etc. Data profiling essentially provides x-ray vision of the source data sets, which helps to understand the
strengths and weaknesses of the data sets. The investment made will have a direct impact on the effectiveness of downstream processes and software code components. It is also important to define the
scope of the data profiling exercise up front to avoid overspending on this task. Basic data profiling can be performed by developing scripts; however, for highly complex and large data sets,
an industrial-strength data profiling tool is worth the investment.
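As an illustration of the script-based approach, here is a minimal Python sketch of a basic profiling pass. The function name profile_rows, the column names, and the sample extract are hypothetical and are not taken from any particular project. The sketch only counts empty values, distinct values, and maximum field length per column.

```python
import csv
import io
from collections import Counter


def profile_rows(rows):
    """Return basic per-column statistics for an iterable of dict rows."""
    stats = {}
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            col = stats.setdefault(
                column, {"nulls": 0, "values": Counter(), "max_length": 0}
            )
            text = (value or "").strip()
            if not text:
                col["nulls"] += 1  # empty or missing values count as nulls
                continue
            col["values"][text] += 1
            col["max_length"] = max(col["max_length"], len(text))
    return {
        column: {
            "rows": total,
            "nulls": col["nulls"],
            "distinct": len(col["values"]),
            "max_length": col["max_length"],
            "most_common": col["values"].most_common(1),
        }
        for column, col in stats.items()
    }


# Hypothetical extract; real input would come from a source file or query.
SAMPLE = """customer_id,state,phone
1001,HI,808-555-0101
1002,CA,
1003,CA,415-555-0199
1003,ca,415-555-0199
"""

profile = profile_rows(csv.DictReader(io.StringIO(SAMPLE)))
for column, summary in profile.items():
    print(column, summary)
```

Even on this tiny sample, the output hints at typical anomalies: a repeated customer_id, a missing phone number, and inconsistent casing in the state column.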
Strategy 2: Create a Data Migration Process Model
target systems. Create an elaborate process model depicting every step of the migration process. The artifact serves as the road map for moving data, as well as an agreement among the stakeholders
involved. The process model also serves as input to the downstream administration, configuration management, and software development processes. The process model should include interim steps that
validate the volume and quality of the data flowing through the process. With these embedded checkpoints, data analysts can make sure that exceptions are within accepted limits and there are no
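An embedded checkpoint of this kind might look like the following Python sketch. It assumes each process step reports how many rows it received, passed on, and rejected. The step names, the counts, and the 1% reject limit are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckpointResult:
    step: str
    passed: bool
    reason: str


def check_step(step, rows_in, rows_out, rows_rejected, max_reject_rate=0.01):
    """Check that every row is accounted for and rejects stay within the limit."""
    difference = rows_in - (rows_out + rows_rejected)
    if difference != 0:
        return CheckpointResult(
            step, False, f"row counts do not reconcile (difference: {difference})"
        )
    reject_rate = rows_rejected / rows_in if rows_in else 0.0
    if reject_rate > max_reject_rate:
        return CheckpointResult(
            step, False, f"reject rate {reject_rate:.2%} exceeds {max_reject_rate:.2%}"
        )
    return CheckpointResult(step, True, "ok")


# Hypothetical counts reported by three steps of a migration run.
results = [
    check_step("extract", rows_in=10_000, rows_out=10_000, rows_rejected=0),
    check_step("transform", rows_in=10_000, rows_out=9_950, rows_rejected=50),
    check_step("load", rows_in=9_950, rows_out=9_700, rows_rejected=200),
]
for result in results:
    print(result.step, "PASS" if result.passed else "FAIL", result.reason)
```

In this example the load step fails because 50 rows are neither loaded nor rejected, which is the kind of silent loss a volume checkpoint is meant to surface.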
Strategy 3: Define Roles and Responsibilities Up Front
imperative. The architects and the project manager should identify all possible roles and assign responsibilities to the roles as part of project planning. The project manager then should formally
assign these roles to all project staff members. By assigning roles and responsibilities up front, project leadership can ensure that the entire data migration life cycle is supported and that appropriate
accountability is established. Conduct a formal walkthrough of the “Roles and Responsibilities” document to get buy-in from all stakeholders and project staff members.
Strategy 4: Divide and Conquer
logical groupings depends on the business context for the data migration task. It is recommended to migrate the smallest data set first (Hawaii, for example) and then move on to larger data sets (such as California). By
following this approach, the team can learn and fine-tune the migration process early on with smaller data sets, thus minimizing risk. The migration of each data set can be treated like a
release, which helps the team immensely with communication. Each release should be followed by a formal release evaluation step, to document and to educate, and refinements for next
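The smallest-first ordering can be sketched in a few lines of Python. The grouping by state follows the example above. The row counts and the function name plan_releases are hypothetical.

```python
def plan_releases(row_counts):
    """Order logical data groups from smallest to largest and number them as releases."""
    ordered = sorted(row_counts.items(), key=lambda item: item[1])
    return [
        {"release": number, "group": group, "rows": rows}
        for number, (group, rows) in enumerate(ordered, start=1)
    ]


# Hypothetical row counts per logical group (here, per state).
ROW_COUNTS = {"California": 4_200_000, "Hawaii": 150_000, "Oregon": 480_000}

releases = plan_releases(ROW_COUNTS)
for release in releases:
    print(f"Release {release['release']}: {release['group']} ({release['rows']:,} rows)")
```

In practice, size is only one input to the ordering. Business priority or data dependencies between groups may justify a different sequence.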
Strategy 5: Invest in Technology/Tool Training
is new to such technology/tool, then it is highly recommended to invest in formal training for the staff that is responsible for development and execution of data migration code components. By
investing in such training, project leadership can minimize the risk associated with the learning curve involved in the project. During the training process, the team also gets the opportunity to
establish a relationship with the vendor’s technical support staff.
Strategy 6: Conduct Performance Testing
the data migration projects have predefined and short time windows for moving data. Hence, it is imperative for the code components to have acceptable performance levels. It is highly recommended
to fully test the code components for performance at production scale. The project staff should continue to tune the software and/or configuration parameters until the desired throughput has been
achieved. Repeated performance testing also helps staff get acquainted with the technology and the migration process.
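To make the throughput target concrete, the following Python sketch times a migration routine and checks whether the projected run time fits the migration window. The function placeholder_migrate, the row volumes, the 8-hour window, and the safety factor of 1.5 are all assumptions made for illustration. A projection from a sample is only a rough early signal and does not replace the production-scale test recommended above.

```python
import time


def measure_throughput(migrate_batch, sample_rows):
    """Return rows per second for one timed run of migrate_batch over sample_rows."""
    start = time.perf_counter()
    migrate_batch(sample_rows)
    elapsed = time.perf_counter() - start
    return len(sample_rows) / elapsed if elapsed > 0 else float("inf")


def fits_window(total_rows, rows_per_second, window_hours, safety_factor=1.5):
    """Return (fits, estimated_hours); the estimate is padded by safety_factor for the check."""
    estimated_hours = total_rows / rows_per_second / 3600
    return estimated_hours * safety_factor <= window_hours, estimated_hours


def placeholder_migrate(rows):
    # Stand-in for the real migration code component (hypothetical transformation).
    return [{**row, "name": row["name"].strip().upper()} for row in rows]


sample = [{"id": i, "name": f" customer {i} "} for i in range(50_000)]
rate = measure_throughput(placeholder_migrate, sample)
ok, hours = fits_window(total_rows=200_000_000, rows_per_second=rate, window_hours=8)
print(f"{rate:,.0f} rows/s, estimated {hours:.2f} h, fits window: {ok}")
```

Re-running the measurement after each tuning change gives the team a simple way to see whether the desired throughput is getting closer.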
Strategy 7: Have a Plan B
business continuity standpoint, it is imperative to have an alternative solution planned and tested before the project begins. Plan B must be formulated
with input from all stakeholders, including business leadership, business users, operational IT staff, and migration project leadership. The migration project leadership should obtain sign-off from all
stakeholders on Plan B and communicate any changes thereafter. By communicating the plans and intentions to all stakeholders, the project leadership can ensure that all dependent business
processes are prepared for the change.
As noted above, most data migration projects fail, and for various reasons. One of the primary reasons is underestimation of the scale and complexity of the data migration
effort. By proactively investing in estimation and planning, IT managers can get a good handle on the project. Data migration is a multidimensional effort, which can be time sensitive and mission
critical. By following these simple and proven strategies, IT managers can improve the probability of success.
1. Data Migration in the Global 2000, Bloor Research (September 2007)