One of the most exciting developments in the banking and insurance industries in the 1990s has been the ability to develop Customer Relationship Management (CRM) systems from the abundance of
economic, demographic, lifestyle, psychographic, DSS and Internet data. With a plethora of product offerings, including multiple lines of insurance, families of investments, and at-home banking
options, to name just a few, organizations are scrambling to develop a “relationship” with a targeted audience. At the core of these efforts is the need to establish a clear view of the customer,
be it individual, household, business or any combination thereof. Understanding the entire customer relationship and all of the “touch points” at which customers make contact enables financial
institutions to set about acquiring new customers, offer timely incentives for increasing business with established customers, and plan for long-term relationships with the most profitable
customers.
CRM is a process that incorporates a set of technologies that must be repeatable and consistent in order to facilitate the numerous touch points that customers engage in on a daily basis. The
systems in place at each touch point must be capable of identifying and matching customer records in order to integrate the data from and to multiple sources. As shown below, customers are often
represented in a variety of forms depending on the relationship within each organization.
Data from each source differs in format, degree of completeness, accuracy and standardization (e.g. last name, first, MI). Additionally, there are enormous complexities inherent in data types,
both customer and generic (e.g. telephone numbers, product codes, email), that aid in the identification of relationships between customers, households and businesses. It is clear that the
success of a CRM program depends heavily on the quality of the data and the ability of the organization to share consistent, accurate information across the enterprise.
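To make the standardization problem concrete, the following minimal sketch shows how the same customer, recorded as “last name, first, MI” in one source and as “first MI last” in another, might be reduced to a common form before matching. The function names (standardize_name, same_customer) and the matching rule are illustrative assumptions, not part of any particular product; production matching typically also has to handle compound surnames, titles, nicknames and misspellings.

```python
import re


def standardize_name(raw: str) -> tuple[str, str, str]:
    """Return (last, first, middle_initial) in upper case.

    Assumes only two input layouts: "Last, First MI" and "First MI Last".
    """
    # Replace anything that is not a letter, comma, apostrophe, hyphen or space.
    cleaned = re.sub(r"[^A-Za-z,'\-\s]", " ", raw).strip()
    if "," in cleaned:
        # "Last, First MI" layout
        last, _, rest = cleaned.partition(",")
        parts = rest.split()
        first = parts[0] if parts else ""
        middle = parts[1] if len(parts) > 1 else ""
    else:
        # "First MI Last" layout
        parts = cleaned.split()
        if not parts:
            return ("", "", "")
        first = parts[0]
        last = parts[-1] if len(parts) > 1 else ""
        middle = parts[1] if len(parts) > 2 else ""
    return (last.strip().upper(), first.upper(), middle[:1].upper())


def same_customer(a: str, b: str) -> bool:
    """Hypothetical rule: last and first names must agree, and middle
    initials must agree unless one of the records lacks one."""
    last_a, first_a, mi_a = standardize_name(a)
    last_b, first_b, mi_b = standardize_name(b)
    if not last_a or (last_a, first_a) != (last_b, first_b):
        return False
    return mi_a == mi_b or not mi_a or not mi_b


if __name__ == "__main__":
    print(standardize_name("Smith, John Q."))
    print(same_customer("SMITH, JOHN", "John Q. Smith"))
```

Exact comparison on standardized fields is deliberately simplistic here; it only illustrates why a shared standard has to exist before records from different touch points can be matched at all.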
There are both data and technical challenges inherent in implementing enterprise-wide data quality (EWDQ) in support of CRM systems. Most common among the data challenges, aside from those
mentioned above, are varied standards, mistakes, misspellings, the misrepresentation of homonyms, metadata analysis, recognition of legal entities, and missing or incomplete data.
To be effective, EWDQ as a process must meet a number of technical challenges. The process must work across platforms and information architectures (MVS, UNIX, NT etc.), must be adaptable enough to
capture knowledge from an organization, and must be easy enough to use that it does not drive users away. A successful data cleansing solution should be capable of being implemented in high performance applications to
cleanse, identify and match records in preparation for loading to data warehouses or operational data stores, for application conversions like SAP, entity management or in an on-line (active data
warehouse) environment. In addition, more and more organizations are operating on a global scale that necessitates an understanding of international data sets. Above all, organizations must be
able to depend on a data cleansing solution that establishes standards that are portable, with output that can be consistently applied throughout the enterprise. Effective data quality is the
result of an architecture designed to help solve the complex problems inherent in legacy system data, while also serving as the operational gatekeeper for incoming data. This provides customers
with the ability to rapidly implement long-lasting, sustainable data quality solutions that cleanse and recondition large volumes of global data from multiple sources.
The Need for Standard Processes
Organizations strive to develop processes that ensure efficiency throughout their enterprise. It only makes “cents” to ensure the same quality with respect to their data. Larry English described
the real cost to organizations in DM Review: “The costs of non-quality result from the scrap and re-work of defective products and services and in customer lifetime value when non-quality causes
one to miss a new opportunity or to lose a customer.” English also holds that ensuring a process is in place requires, among other things, several key characteristics:
- The process must be defined
- It must be repeatable
- It has a process owner
- It results in reusable and reused data
- It captures data as close to the point of origin as is feasible
- It incorporates controlled evolution
- It involves developing the data model to support the major information views across the enterprise
Few data entry processes are designed to ensure data quality. In particular, many older systems tried to save some time and effort by grouping all the name information into one field. However, even a
structure outlined with the above rules is not foolproof. For example, should the name of a product be written as “PC” or “Personal Computer”? We can see that even this simple, valid variation
between two alternatives will cause problems later on in merging data, and highlights the need for enterprise level standards which would avoid this situation. Of course, it is possible to write
routines that will map these two variations of the spelling of the product name into a single consistent structure.
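A routine of this kind can be very small. The sketch below assumes “Personal Computer” is the chosen corporate standard (the text does not say which form should win) and uses a hypothetical lookup table and function name.

```python
# Hypothetical lookup table: every known variant maps to one canonical form.
# Which form is canonical is an assumption made for illustration.
CANONICAL_PRODUCT_NAMES = {
    "pc": "Personal Computer",
    "personal computer": "Personal Computer",
}


def standardize_product_name(raw: str) -> str:
    """Map known spelling variants of a product name to a single form.

    Unknown names are returned trimmed but otherwise unchanged.
    """
    # Ignore case, periods and repeated whitespace when looking up variants.
    key = " ".join(raw.lower().replace(".", "").split())
    return CANONICAL_PRODUCT_NAMES.get(key, raw.strip())


if __name__ == "__main__":
    for value in ["PC", "P.C.", "personal  computer", "Laptop"]:
        print(value, "->", standardize_product_name(value))
```

The mapping itself is trivial; the cost lies in maintaining the table and in how many places the routine has to be run.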
But if there are a dozen data marts in an enterprise, does it really make sense to have each of the extraction and transformation processes for each of the data marts perform the same mappings on
the same data over and over again? Or does it make more sense to perform this same edit check at the time of data entry, and enforce a corporate standard at the beginning of the process and save
the effort of developing the code and reusing the code on a regular basis for multiple applications? The message is simple: Don’t keep cleaning the same data at the back end: fix it at the source.
Lastly, it is imperative to ensure that the cleansed data satisfies the users and enables them to make mission-critical decisions.
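Fixing data at the source can be pictured as a single check that every entry screen or feed calls before a record is stored, so that no downstream extract has to repair the value again. The sketch below is only an illustration under assumed names: APPROVED_PRODUCT_NAMES stands in for a corporate standard, and EntryResult and check_product_at_entry are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical corporate standard: the only product names accepted at entry.
APPROVED_PRODUCT_NAMES = {"Personal Computer", "Printer"}


@dataclass
class EntryResult:
    accepted: bool
    message: str


def check_product_at_entry(value: str) -> EntryResult:
    """Accept a product name only if it already matches the standard.

    A rejected value goes back to the person or system entering it,
    instead of being cleaned separately by each downstream data mart.
    """
    if value in APPROVED_PRODUCT_NAMES:
        return EntryResult(True, "ok")
    return EntryResult(False, f"'{value}' is not an approved product name")


if __name__ == "__main__":
    print(check_product_at_entry("Personal Computer"))
    print(check_product_at_entry("PC"))
```

Rejecting at entry is one possible policy; an organization could equally choose to correct the value automatically at that point, as long as the rule is applied once and in one place.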
Besides the obvious benefit of avoiding the tedious data reengineering phases of data warehousing, there are numerous other benefits to the enterprise in adopting a process-oriented approach to
data cleansing. First of all, we must remember that CRM is only one of many initiatives underway in large organizations today. Operational Data Stores, eBusiness and on-line applications,
analyses of existing databases, and uploads and extracts into new applications are other areas in which the data quality issue rears its ugly head. “Garbage in, garbage out” applies as much to a
simple query against a sales database as it does to a galactic warehouse project, and the ramifications of a bad decision based on bad data are equally threatening. The need for clean data
throughout the enterprise is universal.
As financial organizations become more globally focused and centered around CRM systems for growing their business, the impact of data quality becomes more important and obvious. EWDQ solutions
have a proven track record in accomplishing real data quality standards that impact both customer relationships and the bottom line.