In recent years, there has been a great deal of interest and debate concerning Agile development methodologies, particularly regarding the role that data professionals can (and should) play in Agile Development. Some Agile practitioners, such as Scott Ambler, view data professionals as a necessary evil and suggest that developers try to work with them as best they can. Others (check any development blog or discussion group) regard data professionals as an unnecessary evil, and advocate circumventing or eliminating data requirements whenever possible. There is a definite opinion in the development world that data engineers and DBAs simply impede the progress of application development projects and increase costs, without adding much value.
In my current position as a consulting DBA to Agile development teams at a large global manufacturer, I’ve spent the past several years learning how to support Agile projects in an effective manner, adding significant value to the development effort without sacrificing critical data standards – increasing the velocity of projects while still ensuring that our databases are robust, secure, business-centered and able to support all the information needs of the company.
First, a short primer, for those of you who are new to Agile: Agile Development (AD) is the latest in a series of iterative approaches to application development that have been suggested as an alternative to the traditional “waterfall” (or Software Engineering) approach, where all the requirements are gathered before starting analysis, all analysis is done before starting design, and all design is done before starting to code. In Agile (as with other iterative methodologies), requirements gathering, analysis, design, coding, testing and deployment activities take place more-or-less simultaneously in a series of short (2-4 week) development cycles. Each cycle ends with a working application that delivers some subset of the necessary functionality and can be tested and evaluated by the business users. AD cycles (usually called “sprints”) always begin and end with a working product.
Specifically, the process (as we do it) works something like this: there is an initial series of meetings with the business users (called “workshops”), in which a preliminary set of business requirements is gathered and documented in the form of “user stories” (“As a [business role], I need to do [some business function] in order to achieve [some business value].”). Each of these stories is analyzed just enough to determine its relative complexity, its business value and a rough estimate of the time required for completion. The stories are prioritized according to their importance to the business (some will have to be included because other stories depend on them), and a subset of them is selected for the project (usually determined by the available time and budget). This subset of user stories is called the “backlog.”
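To make the selection step concrete, here is a minimal sketch in Python of how a backlog might be chosen from prioritized stories. It is only an illustration, not a description of any particular team's tooling: the `UserStory` fields, the `select_backlog` function, the greedy highest-value-first rule and the sample numbers are all assumptions made for the example. The one rule taken from the process above is that a story's prerequisites are pulled in along with it, even when their own business value is low.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class UserStory:
    """Hypothetical record for one user story (field names are assumptions)."""
    name: str
    business_value: int   # relative importance to the business; higher is better
    estimate_days: int    # rough estimate of the time required for completion
    depends_on: List[str] = field(default_factory=list)


def _with_prerequisites(story: UserStory,
                        by_name: Dict[str, UserStory],
                        already_selected: Set[str]) -> List[UserStory]:
    """Return the story plus any unselected prerequisites, prerequisites first."""
    ordered: List[UserStory] = []
    seen = set(already_selected)

    def visit(current: UserStory) -> None:
        if current.name in seen:
            return
        seen.add(current.name)
        for dep_name in current.depends_on:
            visit(by_name[dep_name])
        ordered.append(current)

    visit(story)
    return ordered


def select_backlog(stories: List[UserStory], budget_days: int) -> List[UserStory]:
    """Greedy sketch: take stories in order of business value while the budget lasts.

    A story is only taken if it and all of its missing prerequisites fit in
    the remaining budget; otherwise it is skipped.
    """
    by_name = {s.name: s for s in stories}
    selected: Dict[str, UserStory] = {}
    remaining = budget_days

    for story in sorted(stories, key=lambda s: s.business_value, reverse=True):
        if story.name in selected:
            continue
        needed = _with_prerequisites(story, by_name, set(selected))
        cost = sum(s.estimate_days for s in needed)
        if cost <= remaining:
            for s in needed:
                selected[s.name] = s
            remaining -= cost

    return list(selected.values())


# Illustrative data only; the stories and numbers are made up.
stories = [
    UserStory("enter order", business_value=8, estimate_days=5,
              depends_on=["customer lookup"]),
    UserStory("customer lookup", business_value=3, estimate_days=3),
    UserStory("sales report", business_value=5, estimate_days=6),
    UserStory("export to pdf", business_value=2, estimate_days=4),
]
backlog = select_backlog(stories, budget_days=10)
print([s.name for s in backlog])
```

In practice this selection is a negotiation with the business users rather than a calculation, so treat the sketch as a way to see the inputs (value, estimate, dependencies, budget) rather than as a planning tool.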
For each 2-4 week sprint in the project, a certain number of user stories are selected from the backlog to be worked on in support of a stated goal for that sprint. The goal is to create a working application that has some particular business functionality. Each story is analyzed, a solution is designed and the design is then handed off to some members of the project team (usually a pair of developers) for coding. The completed code is tested, and reviewed for quality and completeness. After all the stories in the sprint have been completed, tested and validated, the application is handed to the business users for testing and feedback while the next sprint is begun.
So the question is this: Is Agile Development a methodology that the data community can support? In a sense, the question is academic, since Agile is the way application development is being done and will continue to be done for the foreseeable future. But, in spite of the opposition and ill-feeling that seem to exist between data professionals and Agile developers, I believe that the Agile approach has a number of advantages:
- It reduces the risk of project failure. Business users are continually involved in defining and evaluating the product, and the product requirements can be adjusted on the fly in response to changing business needs. Too many projects fail because by the time the users get to see the finished product, it either isn’t what they wanted or it isn’t what the business needs now.
- It enables resources to be quickly adjusted in response to project needs. The need for additional resources can be identified more quickly, and those resources can be added during difficult sprints, with less time required to bring them up to speed.
- It increases product quality. All too often, in traditional approaches, testing isn’t begun until the coding is complete, and sometimes it isn’t done at all. I’ve seen projects that spent a year and a half being debugged after they went live in production! In Agile, by contrast, early and continual testing is an integral part of the development process.
- It enables project managers to keep better control of a project’s scope, risk, deliverables, progress and budget. All too often in traditional approaches, the project manager doesn’t see the danger until the ship is on the rocks.
- It emphasizes a collaborative approach to development that brings together people from different disciplines, improving communication and reducing misunderstanding and “silos of information.”
So here’s the good news about Agile Development – it’s a proven methodology with many positive and valuable benefits, and it is possible (indeed, essential) for data professionals to contribute significant value to AD projects without sacrificing the essential requirements of data quality, security, maintainability and reusability.
Now here’s the bad news – it’s a lot of work! Agile is an extremely fast-paced and highly collaborative methodology, and it requires a much greater degree of involvement and commitment than traditional approaches. Therefore, Agile requires a different set of processes (and tools) than traditional approaches, as well as a different mind-set. We don’t have to abandon the work we do (which contributes immense value to our companies in the form of high-quality, business-centered, reusable data); we just have to do it faster, smarter and more interactively.
In the upcoming series of articles, we’ll be exploring a number of crucial topics regarding Agile data development, including:
- The critical distinction between the logical and physical views of data
- The difference between logical and physical design
- How much design (and implementation) should be done, and when
- Refactoring at the logical and physical levels
- Normalization and denormalization (when, where, why and how)
- The importance of data virtualization
- Handling rapidly changing requirements
- Tools and techniques for rapid database development
- Organizational and cultural issues around Agile Development
- Developing an “Agile Attitude”
I’d like to make this a dialogue, so please feel free to email questions, comments and concerns to me. Thanks for reading!