With the current economic downturn, it is apparent that organizations will have projects put on hold and their budgets cut or frozen. Companies will have to make do with what they already have instead of experimenting with new technologies. The CFO’s office will look inwardly at how to achieve higher returns on the assets already employed rather than at the ROI on new purchases. This is just today’s harsh reality.
Software as a service (SaaS) and on-demand business intelligence (BI) will touch the world of data quality too, as it becomes increasingly important (almost mission critical) for BI projects to be successful. In this article, we will look at what is currently happening in the data quality (DQ) marketplace and explore upcoming trends and predictions. According to a recent Gartner survey of around 500 data managers in six countries, 28% of companies indicated that they have deployed SaaS-based data integration tools, and 24% have deployed SaaS-based data quality tools. This suggests that on-demand data integration and data quality software are showing signs of breaking into the mainstream.
This section talks about 11 predictions and observations about what’s going to happen in the world of SaaS and DQ.
Upfront data quality assessment using real-time integration: The face of data quality is changing. The focus will shift from reactive data quality to proactive data quality assessment and prevention through point-of-entry data quality applications. Data quality assessment (DQA) is defined as the inspection, measurement, and analysis of data to help business users understand the defects in the data and the impact of those defects on the business. It will involve identifying inaccurate, inconsistent or duplicate records; a focus on DQ rather than on software, hardware and administration issues; scheduled data assessment, profiling and reporting; a DQA dashboard that lists the top data issues; and pattern recognition to support better business decisions. The philosophy of upfront data analysis will make critical downstream cleansing easier. In other words, data quality tools will no longer be batch-oriented but are predicted to become more real time.
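To make the idea of assessment concrete, here is a minimal sketch in Python of the kind of inspection and measurement a DQA pass performs. The records, field names and rules are hypothetical, purely for illustration; real DQ tools apply far richer rule sets.

```python
import re
from collections import Counter

# Hypothetical CRM records; field names are illustrative, not from any tool.
records = [
    {"id": 1, "email": "ann@example.com", "phone": "555-0101"},
    {"id": 2, "email": "bob(at)example.com", "phone": ""},
    {"id": 3, "email": "ann@example.com", "phone": "555-0101"},
]

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def assess(rows):
    """Inspect and measure data defects: completeness, validity, duplicates."""
    report = {"missing_phone": 0, "invalid_email": 0, "duplicates": 0}
    # Count extra copies of any (email, phone) pair as duplicate records.
    seen = Counter((r["email"], r["phone"]) for r in rows)
    report["duplicates"] = sum(n - 1 for n in seen.values() if n > 1)
    for r in rows:
        if not r["phone"]:
            report["missing_phone"] += 1
        if not EMAIL_RE.match(r["email"]):
            report["invalid_email"] += 1
    return report

print(assess(records))  # one defect of each kind in this sample
```

The counts such a pass produces are exactly what a DQA dashboard would surface as the top data issues.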
More experimentation: The SaaS deployment model allows low-risk experimentation with emerging technologies, including in the area of data quality, as these technologies are usually available over the Internet and require no upfront on-premise hardware investment. Also, as described in my September 2008 article On-Demand versus On-Premise BI, SaaS-based pricing models rarely involve long-term contracts or license investments. For example, a customer switching to a SaaS-based CRM solution like Salesforce.com might want to implement address validation at the point of entry into their CRM application to gain the maximum benefit from their CRM investment.
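A point-of-entry check of this kind can be sketched in a few lines. The fields, the rule set and the truncated state list below are illustrative assumptions; a real deployment would call a SaaS address-validation service rather than local rules.

```python
# Point-of-entry sketch: reject obviously bad addresses before they reach
# the CRM. Field names and rules are hypothetical; a production system
# would delegate to a SaaS address-validation service instead.
US_STATES = {"CA", "NY", "TX", "WA"}  # truncated sample list for the sketch

def validate_address(addr: dict) -> list:
    """Return a list of defects; an empty list means the record may be saved."""
    problems = []
    if not addr.get("street"):
        problems.append("missing street")
    if addr.get("state") not in US_STATES:
        problems.append("unknown state code")
    zip_code = addr.get("zip", "")
    if not (len(zip_code) == 5 and zip_code.isdigit()):
        problems.append("malformed ZIP")
    return problems

print(validate_address({"street": "1 Main St", "state": "CA", "zip": "94105"}))  # []
print(validate_address({"street": "", "state": "ZZ", "zip": "9410"}))
```

Blocking the second record at entry is cheaper than cleansing it downstream, which is the whole argument for point-of-entry DQ.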
DQ and metrics-based analytics: In my June 2008 article Overview of On-Demand BI and its Key Characteristics, I talked about a key trend – the use of a metrics-based approach to implement business services and functions. If a company wishes to run itself this way, the reliance on DQ becomes even more critical, as one cannot run a company on metrics backed by inaccurate, inconsistent or incomplete data. There is a common misconception that the analytical reporting needs of SMBs (small and medium businesses) are less complex than those of bigger enterprises. In this economic downturn, small and mid-size businesses will rely even more on technology and information to run their businesses in a lean and mean fashion.
Business units getting self-reliant: A lot of SaaS-based data quality deployments will be initiated directly by business units. Take the example of a VP of Sales who has gone off on his or her own to implement, say, a SaaS-based CRM system like Salesforce or an ERP system like NetSuite. He or she would not want to rely on IT to implement data quality initiatives on that CRM system but will most likely seek another SaaS vendor that can provide point-of-entry data quality benefits to improve the quality of the CRM investment.
Hybrid implementations: Another trend I talked about in my June 2008 article Overview of On-Demand BI and its Key Characteristics was the emergence of hybrid architectures, wherein on-premise data, applications and processes are linked to SaaS-based applications through web services-based integration APIs (application programming interfaces). With the current economic downturn, projects to convert on-premise applications to SaaS-based applications are just not going to happen at the pace we initially thought they would. Instead, we will see on-premise investments leveraged and combined with new SaaS-based applications to derive the maximum business advantage. Besides, the security concerns that some enterprises have about keeping data outside the firewall will also require that some applications stay on premise. On-premise BI has been a proven deployment model for quite some time, and it will not be easy to simply replace it with a SaaS-based application.
SaaS-based DQ tools to catch up with on-premise tools: Vendors will try to offer the same capabilities in their SaaS-based offerings as in their on-premise versions. The gap between the two offerings will steadily decrease. Once this starts happening, I feel that the adoption of SaaS-based data quality tools will increase at a rapid pace.
Support for internationalization: As global companies start adopting data quality in a big way, the need for internationalization of data will become all the more prominent. Data quality applications will need to support not only multi-byte character sets but also localization, so that users can continue to work with the applications in their native language.
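Multi-byte support matters even for something as basic as duplicate detection: the same accented name can arrive in different Unicode encodings, and a byte-for-byte comparison will miss the match. A minimal sketch of the standard fix, Unicode normalization plus casefolding (the example name is hypothetical):

```python
import unicodedata

def normalize(name: str) -> str:
    """Canonical form for matching: Unicode-normalize (NFC), then casefold."""
    return unicodedata.normalize("NFC", name).casefold()

# "é" can arrive as one precomposed code point or as "e" plus a combining
# accent; the two strings render identically but compare unequal raw.
a = "Ren\u00e9"    # precomposed é
b = "Rene\u0301"   # e + combining acute accent
print(a == b)                        # False: raw strings differ
print(normalize(a) == normalize(b))  # True: same name after normalization
```

Any DQ matching engine aimed at international data has to normalize in this spirit before it compares records.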
DQ tools used by business analysts: So far, data quality tools have been aimed primarily at a technical audience that profiles and/or cleanses data. Over the next five years, data quality tools will increasingly be adopted by people closer to the business – business analysts and SMEs (subject-matter experts) – who want to lessen their dependence on IT and be more self-sufficient. As a result, the tools will have to be built for a non-technical audience, with a huge focus on user experience. Products targeted at a non-technical audience will become non-existent if they are difficult to use and require extensive training.
Emergence of broader DQ software suites: Until a couple of years ago, around 2006, there were niche players that could be called specialists in certain specific areas. In the next five years, only those vendors will survive that have a broad range of DQ capabilities spanning data profiling, data cleansing, standardization, matching, DQ dashboards and scorecards, and identity resolution. Needless to say, these DQ software tools will need to provide data integration capabilities too, as data integration and data quality have become almost inseparable. I believe that standalone data quality vendors will become nearly non-existent in the next five years, although there will continue to be some niche players, such as startups, that continue to innovate in that space.
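Two of the capabilities listed above, standardization and matching, work hand in hand: standardizing first is what lets a matcher recognize two differently written records as the same entity. A toy sketch (the abbreviation table and addresses are made up for illustration; real suites add fuzzy scoring on top):

```python
# Minimal standardize-then-match sketch; the abbreviation table is a
# hypothetical fragment of what a real standardization engine carries.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}

def standardize(address: str) -> str:
    """Lowercase, strip punctuation, expand common street abbreviations."""
    tokens = address.lower().replace(".", "").replace(",", "").split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def is_match(a: str, b: str) -> bool:
    """Exact match on the standardized form; real tools add fuzzy scoring."""
    return standardize(a) == standardize(b)

print(is_match("123 Main St.", "123 main street"))  # True
print(is_match("123 Main St.", "124 Main St."))     # False
```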
Relationship between MDM and DQ: Every master data management (MDM) initiative will begin to have a strong DQ component. MDM hub implementations will most likely involve batch DQ capabilities such as processing quarterly feeds from third parties like D&B, nightly updates to the MDM hub from source systems, and the initial load from source systems into the hub. Mergers and acquisitions will become the biggest driver for MDM implementations, and data governance approaches are essential when integrating data from acquired companies. It does not really matter whether DQ comes through an OEM arrangement or as a capability inherent within the MDM product itself; what matters is that MDM and DQ will be joined at the hip, and there is no getting away from it. Some say that the DQ engine will eventually drive the success of the MDM initiative, and the reality may not be far off from that assertion. The focus will also extend beyond customer data: data quality for non-customer data, like financial or product data, will gain prominence as DQ tools become more domain-agnostic.
Shared business rules: As data integration, MDM and data quality products become more integrated, they will start sharing a common business rules platform that lets them work in a common workspace and share the same metadata across all products. It should be possible to leverage artifacts created in one environment across products and technologies.
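The payoff of shared rules can be sketched simply: define each rule once, and let both a profiling pass and a cleansing pass reuse it. The registry, rule names and records below are hypothetical, intended only to show the define-once, use-everywhere shape.

```python
# Sketch of a shared rule registry: each rule is defined once and reused
# by profiling (count violations) and cleansing (repair or filter).
RULES = {
    "country_code_upper": lambda r: r["country"] == r["country"].upper(),
    "has_name": lambda r: bool(r["name"].strip()),
}

def profile(rows):
    """Profiling reuses the shared rules to count violations per rule."""
    return {name: sum(0 if rule(r) else 1 for r in rows)
            for name, rule in RULES.items()}

def cleanse(rows):
    """Cleansing reuses the same rules: fix what it can, drop the rest."""
    fixed = []
    for r in rows:
        r = dict(r, country=r["country"].upper())  # repairable rule
        if RULES["has_name"](r):                   # unrepairable: filter out
            fixed.append(r)
    return fixed

rows = [{"name": "Acme", "country": "us"}, {"name": " ", "country": "DE"}]
print(profile(rows))   # each rule violated once in this sample
print(cleanse(rows))   # one repaired record survives
```

Because both functions read the same registry, a rule changed in one place changes behavior everywhere – which is precisely the metadata-sharing promise described above.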
Data quality is moving beyond the IT domain to become a part of the business agenda. As the strategic importance of information becomes more apparent, IT and business will become more than
partners. They will now realize they must be joined at the hip. Overall, Gartner predicts that the SaaS-based software market, including data management technologies like data integration and data
quality, will reach $19.3 billion by the end of 2011, up from $6.3 billion in 2006. So it is clear that software vendors will focus their energies in this space to come up with better and more
integrated offerings. This is all good news for customers, as there will be healthy competition amongst vendors fighting to grab market share. We are all aware of the steps the government is taking to financially bail out industries such as financial services. As this happens, compliance and regulation will be brought to the forefront as taxpayers feel the need to know what is going on with their money, and this will require data to be accurate, consistent and timely. The SaaS approach can be a good model here, enabling businesses – big and small – to adopt and implement data quality improvements in an enterprise-wide fashion, albeit starting with small, focused investments.