Proven Strategies for Executing Large-Scale Data Modeling Projects

With the proliferation of enterprise-class applications such as ERP, CRM and SFA, development of grassroots applications is less commonplace. The industries deploying these enterprise applications
also benefit from the research and development that goes into these products and from the resources available for them. On the other hand, business innovation brings new business models
that require innovative business processes. These creative business processes require new ways to manage the organization's data, and IT departments in such industries are required to model
the data for these enterprise-class applications. IT managers also face at least two other scenarios that can generate the need to model data from scratch:

  1. Technology refresh for legacy applications
  2. Consolidation due to mergers and acquisitions

These enterprise-class applications tend to have large data footprints, which require large-scale data models.

When IT faces such an immediate and large-scale data modeling task, IT managers have two options:

  1. Develop data models in-house
  2. Outsource a component or the entire project to a vendor

Since data modeling requires significant business knowledge and is an extremely interactive process, IT managers favor the first option: in-house development. Business leaders are
required to bring products and services to market in a very short time frame. IT managers are challenged with aggressive deadlines and limited budgets. Missing dates or creating poor-quality data
models can have disastrous effects on the business. Data architects and modelers can certainly benefit from industry-proven strategies for executing large-scale data modeling projects. This
article is an introduction to those strategies that IT managers can find useful.

Proven Strategies

Strategy 1: Define data modeling standards up front.

An up-front investment in defining data modeling standards such as data domains and naming standards can have a tremendous impact on the quality of the end product – the enterprise data
model. Project leadership should also make sure that these standards are communicated in a timely manner to all team members and are followed by every team member.
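
Standards are easier to follow when they can be checked mechanically. The following is a minimal sketch of such a check. The standard it encodes (UPPER_SNAKE_CASE names ending in a class-word suffix that determines the data domain) and the names DOMAIN_BY_SUFFIX, NAME_PATTERN and check_column are assumptions made for illustration, not standards prescribed by this article.

```python
import re

# Hypothetical standard: every column name ends in a class-word suffix,
# and each suffix is tied to exactly one data domain (data type).
DOMAIN_BY_SUFFIX = {
    "_ID": "INTEGER",
    "_DT": "DATE",
    "_AMT": "DECIMAL(18,2)",
    "_NM": "VARCHAR(100)",
}

# Hypothetical naming rule: UPPER_SNAKE_CASE.
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def check_column(name, data_type):
    """Return a list of standards violations for one column (empty if none)."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("name is not UPPER_SNAKE_CASE")
    suffix = next((s for s in DOMAIN_BY_SUFFIX if name.endswith(s)), None)
    if suffix is None:
        problems.append("name lacks an approved class-word suffix")
    elif DOMAIN_BY_SUFFIX[suffix] != data_type:
        problems.append(f"{suffix} columns must use {DOMAIN_BY_SUFFIX[suffix]}")
    return problems


for column, data_type in [
    ("CUSTOMER_ID", "INTEGER"),
    ("orderDate", "DATE"),
    ("INVOICE_AMT", "FLOAT"),
]:
    print(column, check_column(column, data_type))
```

A check like this could run whenever a team submits a business area model, so that deviations from the standards surface early instead of during final review.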

Strategy 2: Train business analysts on the fundamentals of database design and data modeling techniques.

Business analysts are your stakeholders in the project. By training business analysts in the fundamentals of database design and data modeling techniques, you can eliminate the communication
barrier between the business subject-matter experts (SMEs) and data modelers. A short course in database technology and data modeling will also help establish a long-term productive relationship
between IT and business groups.

Strategy 3: Divide the entire task into subtasks centered on business areas.

Like any other large-scale task, the data modeling task should be divided around the defined business areas. The size and complexity of the business areas depend on the application itself,
but dividing the task among several teams allows you to meet project deadlines for intermediate and final deliverables.

Strategy 4: Capture and store data architecture-related artifacts in relational format.

By capturing data architecture artifacts (e.g., data element usage by business processes) in relational format, you can import these artifacts and avoid modeling the data from scratch. By importing
the data element lists into the modeling tool, you can jump-start the data modeling process. This strategy also ensures high-quality work products since manual data entry into the data modeling
tool is minimized.
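
As an illustration, the sketch below stores data element usage in a relational table and produces a de-duplicated element list as CSV text that a modeling tool could import. The table name data_element_usage, its columns, the sample rows and the function export_element_list are all hypothetical, and the import format a particular modeling tool accepts is an assumption that would need to be checked against that tool.

```python
import csv
import io
import sqlite3

# Hypothetical artifact store: which business process uses which data element.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_element_usage (
        business_process TEXT NOT NULL,
        data_element     TEXT NOT NULL,
        data_domain      TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO data_element_usage VALUES (?, ?, ?)",
    [
        ("Order Entry", "Customer Name", "Name"),
        ("Order Entry", "Order Date", "Date"),
        ("Billing", "Customer Name", "Name"),
        ("Billing", "Invoice Amount", "Amount"),
    ],
)


def export_element_list(connection):
    """Return CSV text with one row per distinct data element."""
    rows = connection.execute(
        """
        SELECT data_element, data_domain, COUNT(*) AS process_count
        FROM data_element_usage
        GROUP BY data_element, data_domain
        ORDER BY data_element
        """
    ).fetchall()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["data_element", "data_domain", "process_count"])
    writer.writerows(rows)
    return buffer.getvalue()


element_csv = export_element_list(conn)
print(element_csv)
```

Because the list is generated by a query rather than typed by hand, an element used by several business processes appears once, with a count showing how widely it is shared.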

Strategy 5: Use the database to QA the data models.

Periodic review and assessment of the data models is an essential step in the project. Manual review of large-scale data models can be time-consuming and may not be feasible. Certain basic and
repetitive checks can be automated (e.g., verifying that audit columns such as CREATE_DATE are defined as mandatory in every table). First, you need to export the data model into relational tables and then write
and execute SQL statements to QA the model.
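
The CREATE_DATE check can be sketched as follows, assuming the model has been exported to a table with one row per column. The table name model_column, its layout and the sample rows are hypothetical; a real modeling tool's export will have its own structure.

```python
import sqlite3

# Hypothetical export of the data model: one row per column in the model.
# is_mandatory: 1 = defined as NOT NULL in the model, 0 = optional.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE model_column (
        table_name   TEXT NOT NULL,
        column_name  TEXT NOT NULL,
        is_mandatory INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO model_column VALUES (?, ?, ?)",
    [
        ("CUSTOMER", "CUSTOMER_ID", 1),
        ("CUSTOMER", "CREATE_DATE", 1),
        ("ORDER_HEADER", "ORDER_ID", 1),
        ("ORDER_HEADER", "CREATE_DATE", 0),  # present but optional
        ("INVOICE", "INVOICE_ID", 1),        # audit column missing
    ],
)

# Report every table whose CREATE_DATE column is absent or not mandatory.
AUDIT_CHECK_SQL = """
    SELECT t.table_name,
           CASE WHEN c.column_name IS NULL THEN 'missing' ELSE 'optional' END
    FROM (SELECT DISTINCT table_name FROM model_column) AS t
    LEFT JOIN model_column AS c
           ON c.table_name = t.table_name
          AND c.column_name = 'CREATE_DATE'
    WHERE c.column_name IS NULL OR c.is_mandatory = 0
    ORDER BY t.table_name
"""

violations = conn.execute(AUDIT_CHECK_SQL).fetchall()
for table_name, issue in violations:
    print(f"{table_name}: CREATE_DATE is {issue}")
```

The left join is what lets a single query catch both cases: a table with no CREATE_DATE row at all, and a table where the column exists but was left optional.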

Strategy 6: Push modeling for all business areas through various stages simultaneously throughout the life cycle of the project.

The data model evolves through numerous stages before it is accepted as a production-ready product. The number and characteristics of the stages depend on the complexity of the application
being developed. By pushing all business areas through the stages (e.g., Ballpark, Conceptual, Logical and Physical) at the same time, the chief data architect can ensure consistency across the
business areas. This strategy also helps the project manager report progress and handle related release management issues.

Strategy 7: Follow a release management process to manage changes made to the data models.

During the project, it is critical to document and track change requests made by the stakeholders. The change history will enable the chief data architect to analyze the changes requested later on
in the project life cycle. It will enable the chief data architect to answer why a change was made, who requested the change and when the change was actually implemented.
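
One possible shape for such a change history is sketched below. The ChangeRequest fields, the history_for helper and the sample entries (including their dates) are illustrative assumptions, not a prescribed format; in practice the log might live in a ticketing system or a database table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ChangeRequest:
    """One logged change: what changed, why, who asked, and when it shipped."""
    request_id: int
    model_object: str
    description: str
    reason: str
    requested_by: str
    implemented_on: Optional[date]  # None while the request is still open


def history_for(log, model_object):
    """Return the changes for one model object, oldest first, open ones last."""
    matching = [cr for cr in log if cr.model_object == model_object]
    return sorted(
        matching,
        key=lambda cr: (cr.implemented_on is None, cr.implemented_on or date.max),
    )


# Illustrative sample entries only.
change_log = [
    ChangeRequest(1, "CUSTOMER.EMAIL", "Lengthen to 254 characters",
                  "Long addresses were being truncated", "Billing SME",
                  date(2024, 3, 4)),
    ChangeRequest(2, "ORDER_HEADER.STATUS", "Add CANCELLED status",
                  "New cancellation process", "Order desk lead", None),
    ChangeRequest(3, "CUSTOMER.EMAIL", "Make column mandatory",
                  "Email is required for e-invoicing", "Marketing analyst",
                  date(2024, 1, 15)),
]

for cr in history_for(change_log, "CUSTOMER.EMAIL"):
    print(f"{cr.implemented_on}: {cr.description} "
          f"(requested by {cr.requested_by}; reason: {cr.reason})")
```

Each record carries the three answers the chief data architect needs: the reason for the change, the requester and the implementation date.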

Strategy 8: Build a data glossary up front using business terms and industry naming standards for data elements and data entities.

In order to follow standard naming conventions across the business areas, it is imperative to have a single glossary of terms that lists each term's full name and its abbreviation. Having a single
glossary at the early stages of the project will ensure that all teams developing the different business area data models adopt the same naming conventions.
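
A shared glossary can also be applied programmatically. The sketch below derives physical names from business names using one glossary, and refuses any term the glossary does not contain. The GLOSSARY entries, the abbreviations and the physical_name function are hypothetical examples, not an industry standard.

```python
# Hypothetical glossary: full business term -> approved abbreviation.
GLOSSARY = {
    "customer": "CUST",
    "identifier": "ID",
    "number": "NBR",
    "amount": "AMT",
    "invoice": "INV",
    "date": "DT",
}


def physical_name(business_name, glossary=GLOSSARY):
    """Build a physical name from a business name using only glossary terms."""
    parts = []
    for word in business_name.lower().split():
        if word not in glossary:
            # Unknown terms are rejected so teams cannot invent abbreviations.
            raise KeyError(f"'{word}' is not in the glossary; add it before use")
        parts.append(glossary[word])
    return "_".join(parts)


print(physical_name("Customer Identifier"))
print(physical_name("Invoice Amount"))
```

Raising an error on an unknown term, rather than passing the word through unchanged, forces new terms to be added to the single glossary before any team uses them.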

Strategy 9: Use a data modeling tool that has a rich set of metadata exchange capabilities.

The metadata associated with the data model seldom stays confined to the data modeling tool's repository. It must be exchanged with other tools for integration and reuse purposes. Choose a
data modeling tool that follows industry-standard XML formats to import and export metadata from its repository.
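
To make the idea of XML-based exchange concrete, here is a minimal round-trip sketch using Python's standard xml.etree.ElementTree module. The element and attribute names (dataModel, entity, attribute, domain, mandatory) are a made-up layout for illustration only; they are not an industry-standard format, and a real tool would use whatever schema it documents.

```python
import xml.etree.ElementTree as ET


def model_to_xml(model):
    """Serialize a model dictionary to XML text (hypothetical layout)."""
    root = ET.Element("dataModel", name=model["name"])
    for entity in model["entities"]:
        entity_el = ET.SubElement(root, "entity", name=entity["name"])
        for attr in entity["attributes"]:
            ET.SubElement(
                entity_el,
                "attribute",
                name=attr["name"],
                domain=attr["domain"],
                mandatory=str(attr["mandatory"]).lower(),
            )
    return ET.tostring(root, encoding="unicode")


def xml_to_model(xml_text):
    """Parse XML text in the same layout back into a model dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "entities": [
            {
                "name": entity_el.get("name"),
                "attributes": [
                    {
                        "name": a.get("name"),
                        "domain": a.get("domain"),
                        "mandatory": a.get("mandatory") == "true",
                    }
                    for a in entity_el.findall("attribute")
                ],
            }
            for entity_el in root.findall("entity")
        ],
    }


sample_model = {
    "name": "Sales",
    "entities": [
        {
            "name": "CUSTOMER",
            "attributes": [
                {"name": "CUSTOMER_ID", "domain": "Identifier", "mandatory": True},
                {"name": "CREATE_DATE", "domain": "Date", "mandatory": True},
            ],
        }
    ],
}

exported = model_to_xml(sample_model)
print(exported)
```

A round trip that returns the original structure is a useful acceptance test when evaluating a tool's exchange capabilities: metadata that does not survive export and re-import cannot be reused reliably elsewhere.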

Strategy 10: Leverage industry-proven data model templates for business data concepts to jump-start the project.

Basic industry-standard data models are available from numerous industry sources, including publishers, software vendors and research organizations. With basic entities and template
data models in hand, data modelers can focus on the complex business data requirements.

Strategy 11: Actively manage the project’s scope.

Quite often, the scope of the application and subsequently the data requirements for the application spiral out of control, unless it is simply a technology refresh project for a legacy
application. In such cases, the data modelers face never-ending data requirements. The data modelers and the project manager should actively manage the scope of the data modeling effort throughout
the project life cycle.

Strategy 12: Work closely with the business users and business analysts throughout the project.

Many small and large data modeling projects fail simply because the data modeler develops the models in a vacuum. Frequent interactions and reviews are required to validate that the data
requirements and business rules are captured in the data model. The business users and the business analysts will reject the final product unless they are consulted along the way and feel like they
have invested something in its development.


Large-scale data modeling projects can be intense IT endeavors for new industries, and these projects can be intimidating for established ones. The strategies listed in this article can help data
architects and lead data modelers get a good handle on the project. The strategies are technology- and platform-neutral and are crafted around the people and processes involved. By following these simple and
proven strategies, data modelers can focus on the core task of understanding and modeling data requirements, and IT managers can improve the probability of success for such high-impact
projects.

Satyajeet Dhumne

Satyajeet is an experienced consultant in the fields of data warehousing, business intelligence and data management. He has more than 23 years of experience in the information technology industry, and for the past 12 years he has focused on business intelligence, data warehousing and data architecture. Satyajeet holds an M.S. in Management of Information Technology from the McIntire School of Commerce at the University of Virginia. You may reach Satyajeet via email at