I believe that data professionals must start adopting evolutionary (iterative and incremental) techniques, and better yet agile techniques, if they are to remain relevant within the modern IT
environment. The reason for this is simple: there is a very clear trend within IT towards such an approach. Every modern software development process – Extreme Programming (XP)[1], Feature Driven Development (FDD)[2], Scrum[3], Dynamic System Development Method (DSDM)[4], Crystal Clear[5], Agile Model Driven Development (AMDD)[6], the Rational Unified Process
(RUP)[7], and the Enterprise Unified Process (EUP)[8] – takes an evolutionary approach to development. Data professionals
may not like this, and they may choose to fight against this trend, but the fact is that the IT sector is clearly moving towards evolutionary lifecycles. This change will take several years and will
prove difficult for some, but it will result in greater overall levels of productivity.
It’s easy to say that data professionals need to work in an evolutionary manner, but they also need techniques which allow them to do so. Luckily many techniques do exist, if data professionals
choose to adopt them. And it is a choice: data professionals do not need to limit themselves to near-serial development techniques; they could in fact become effective members of modern development
teams. Data professionals can easily take an evolutionary, if not agile, approach to modeling; they can refactor database schemas to improve the design; and they can take a test-driven development
(TDD) approach to database development to ensure quality[9].
Evolutionary Data Modeling
There is no reason why you cannot take an evolutionary approach to data modeling[10]. The AMDD method describes principles and practices for effective
modeling and documentation, techniques which can clearly be applied to data modeling activities. At the beginning of a project your team should create a slim conceptual domain model, based on your
enterprise domain model if one exists, depicting the main business entities and the relationships between them. At this point in the project you would not fill in details such as data attributes or
responsibilities; this is something you would do during development on a just-in-time (JIT) basis. Your goal is to identify the landscape for now, trusting that you can fill in the details as they
are needed. Agilists embrace change – they know that any investment made early in the project to create a detailed model risks being wasted due to requirement changes. They also know that investing
time in detailed up-front modeling pushes back the development of working software, increasing the risk to your project due to lack of concrete feedback. During development you use your conceptual
domain model to guide your physical object and data modeling efforts to ensure consistency between your schemas.
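To make the idea of a slim conceptual model concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Entity and DomainModel classes, the Customer, Order and Item entities, and the email attribute are hypothetical and are not part of AMDD. The point is only that the initial model names entities and relationships, while attributes arrive later, when a requirement calls for them.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    # Attributes start empty; they are added only when a requirement needs them.
    attributes: dict = field(default_factory=dict)


@dataclass
class DomainModel:
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, name):
        self.entities[name] = Entity(name)

    def relate(self, source, label, target):
        # Refuse relationships to entities that have not been identified yet.
        for name in (source, target):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relationships.append((source, label, target))

    def add_attribute(self, entity, attribute, type_name):
        # Just-in-time detail: called during development, not up front.
        self.entities[entity].attributes[attribute] = type_name


def initial_model():
    """Project start: main entities and relationships only, no attributes."""
    model = DomainModel()
    for name in ("Customer", "Order", "Item"):
        model.add_entity(name)
    model.relate("Customer", "places", "Order")
    model.relate("Order", "contains", "Item")
    return model
```

Later in the project, a call such as model.add_attribute("Customer", "email", "string") would record a detail at the moment a requirement needs it, rather than guessing at it during initial modeling.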
Database Refactoring
Just as developers have learned to refactor their object schemas, data professionals must learn to refactor their database schemas. In Refactoring[11], Martin Fowler describes code refactoring as a disciplined way to make small changes to your code to improve its design, making it easier to understand and to modify. Before
adding a new feature to your code, you ask yourself if the current design is the best one possible to allow you to add that feature. If it is, then do so. If not, refactor your design to make it
the best possible and then add the feature. The end result is that you keep your design the best possible, making it very easy to extend as needed.
Similarly, a database refactoring is a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. Your database schema includes
both structural aspects such as table and view definitions and functional aspects such as stored procedures and triggers. Database refactorings[12][13] are clearly more difficult to implement than code refactorings due to the increased coupling: a simple schema change could affect a score of applications which
access that portion of the schema, so you need to proceed carefully.
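As an illustration of how a schema change can retain informational semantics while other applications still depend on the old structure, here is a minimal sketch using Python's built-in sqlite3 module. The customer table, the fname and first_name columns, and the function names are all hypothetical. The sketch introduces a better-named column beside the old one, backfills it, and uses triggers so that applications still writing the old column keep working. It synchronizes in one direction only (old column to new) and does not show the eventual removal of the old column and triggers, which a real refactoring would schedule once every application has migrated.

```python
import sqlite3


def create_legacy_schema(conn):
    # Hypothetical starting point: a cryptically named column.
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, fname TEXT)")


def introduce_first_name(conn):
    """Add first_name beside fname and keep it in step during a transition."""
    conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
    # Backfill so existing rows keep their information.
    conn.execute("UPDATE customer SET first_name = fname")
    # Applications that still write only fname keep working: these triggers
    # copy their writes into the new column.
    conn.execute("""
        CREATE TRIGGER customer_fname_insert
        AFTER INSERT ON customer
        WHEN NEW.first_name IS NULL
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END
    """)
    conn.execute("""
        CREATE TRIGGER customer_fname_update
        AFTER UPDATE OF fname ON customer
        BEGIN
            UPDATE customer SET first_name = NEW.fname WHERE id = NEW.id;
        END
    """)
    conn.commit()
```

The design choice worth noting is the transition period: because the old column stays readable and writable, no application has to change on the same day as the schema.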
Test Driven Development (TDD)
With TDD, you write a unit test before you write business code. By following this technique agilists build systems with nearly 100% regression unit test suites (user interface testing can be
tough). Yes, project teams must still apply other techniques such as system and acceptance testing as well, but having a full unit regression test suite in place is more than the vast majority
of traditional teams can claim. By having a regression test suite in place you can safely refactor schemas because you know you’ll be able to find, and then fix, any problems resulting from your
refactorings.
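Applied to a database, the test-first idea might look like the following minimal sketch, which uses Python's built-in unittest and sqlite3 modules. The account table, its balance rule, and the create_schema function are hypothetical. In a test-first workflow you would write the two tests before the CHECK constraint exists, watch the first one fail, and then add the constraint to make it pass.

```python
import sqlite3
import unittest


def create_schema(conn):
    # Hypothetical schema under test; the CHECK clause is what the
    # first test below drives into existence.
    conn.execute("""
        CREATE TABLE account (
            id INTEGER PRIMARY KEY,
            balance INTEGER NOT NULL CHECK (balance >= 0)
        )
    """)


class AccountSchemaTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database keeps each test independent.
        self.conn = sqlite3.connect(":memory:")
        create_schema(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_rejects_negative_balance(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute("INSERT INTO account (balance) VALUES (-1)")

    def test_accepts_zero_balance(self):
        self.conn.execute("INSERT INTO account (balance) VALUES (0)")
        count = self.conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
        self.assertEqual(count, 1)
```

Saved as a module, these tests can be run with python -m unittest. Rerunning them after every schema change is what makes the change safe to attempt.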
Got Agile?
Another important trend within the IT sector is the growing popularity of agile techniques such as XP, FDD, DSDM, Scrum, and AMDD. The reason for this is simple – they work very well in practice. A
simple definition of agile software development[14] is that it is evolutionary development performed in a highly collaborative manner, delivering
high-quality, working software which meets the highest priority needs of your project stakeholders. Agilists are building systems with incredibly high-quality code, concise documentation, and a nearly
100% regression test suite, and they are doing so in a cost-effective manner.
Many data professionals claim that agile techniques don’t take data issues into account, a claim I find personally frustrating considering my work in the Agile Data method[15]. Agilists do take data issues into account; we just recognize that data is only one of many important issues that we need to consider. Agilists realize that data professionals
often have a very good understanding of enterprise issues, at least as they pertain to data, something that development teams can clearly benefit from.
There is an opportunity for data professionals to be valuable members of modern development teams, but you must adopt new ways of working. Most of your existing skills are still relevant; you just
need to apply them in evolutionary, and better yet, agile ways. The choice is yours.
[1] Ron Jeffries’ site, http://www.xprogramming.com, is a good starting point.
[2] The FDD site is http://www.featuredrivendevelopment.com
[3] Scrum is described at http://www.controlchaos.com
[4] The DSDM site is http://www.dsdm.org
[5] The Crystal Methodologies family.
[6] The AMDD approach is described at http://www.agilemodeling.com/essays/amdd.htm
[7] IBM’s RUP home page is http://www-306.ibm.com/software/awdtools/rup/support
[8] The EUP, an extension to the RUP, is described at http://www.enterpriseunifiedprocess.com/
[9] Test driven development (TDD) is overviewed at http://www.agiledata.org/essays/tdd.html
[10] An evolutionary approach to data modeling following the AMDD approach is described at http://www.agiledata.org/essays/agileDataModeling.html
[11] Refactoring: Improving the Design of Existing Code, Martin Fowler, Addison Wesley 1999, http://www.amazon.com/exec/obidos/ASIN/0201485672/ambysoftinc
[12] Database refactoring, the process, is described at http://www.agiledata.org/essays/databaseRefactoring.html
[13] A catalog of database refactorings is provided at http://www.databaserefactoring.com
[14] The Agile Alliance site, www.agilealliance.org, is the best place to start learning about agility.
[15] The Agile Data method is described at www.agiledata.org