Published in TDAN.com January 2002
Many companies now have web sites that generate customer data by the gigabyte. The application of “web mining” to find patterns in that information has created a very dangerous
by-product: red ink. This article describes how the well-intentioned focus on customer response rates and similar dependent variables may cause devastating reductions in profitability
instead of increasing earnings. The reasons for this problem, as well as a step-by-step method to avoid repeating it, are not as mysterious as they might seem. By following some basic procedures,
both contract data miners and internal modelers will gain a much higher likelihood of success.
A new era for customer data
Accurate predictions of consumer behavior often depend on the type of data available to the modeler. When rich records of actual consumer behavior are available, powerful models can be created
fairly easily. However, most industries have never had that type of information. In fact, many companies have never had the opportunity to observe their customers’ behavior at all
(manufacturers selling through retail outlets, for example). That is now changing. With the ability to track customer web site visits through web logs, some companies are generating up to 1GB per
day of customer behavioral data.
The birth of web mining
Most web logs record every page viewed, what the visitor was viewing just before visiting the current web page, how long the visitor stayed, and the visitor’s next destination on leaving the
site. This data can be tremendously valuable.
Unfortunately, industry analysts have compared this activity to “trying to drink from a fire hose.” Instead of facing a drought of customer behavior information, they are facing data
repositories of truly daunting proportions. More often than not, these companies have not advanced their metrics past “page hits” and “click-through rates.” Data is
often simply stored offline in massive system back-ups, unused. Webmasters are frequently ill-equipped to derive meaning from the data, yet are terrified to purge any of it.
This, of course, is where data mining enters. The technique of identifying previously unrecognized patterns is the very definition of data mining. Web mining, therefore, is simply the latest
buzzword attached to data mining projects using web-generated data.
A new challenge arises
There has been an industry-wide tendency to use this newly obtained data to solve problems of driving web site page hits, improving click-through rates, driving larger purchase amounts, and
increasing response rates to web-based solicitations. It has been a natural outgrowth of data mining efforts from other channels such as direct mail, cataloging and telemarketing. On its surface,
this is not only rational, but should improve the businesses’ performance. Beneath the surface, however, there can be severely detrimental effects from these efforts.
Two driving factors make this more problematic for web-based endeavors. The first element is the advent of “Internet time.” This simply means that programs can be deployed and expanded
much faster than traditional efforts. Speed means that small errors can lead to devastating results, often faster than the enterprise can respond. Second, web offers can be scaled to very
large efforts at a relatively low cost. For example, email campaigns with a link to the company’s site can be deployed with far less cost in most instances than sending out expensive
catalogs. These more efficient campaigns don’t face the usual corporate speed checkpoints that would call for careful analysis before scaling up to costly deployments.
Whether one or both factors are at fault, the result is the same. A poorly designed web-based promotion enhanced by advanced predictive modeling can quickly devastate a company’s earnings.
The reason: elusive pitfalls related to web-based promotions can actually cause losses at an alarming pace. When the flawed programs are targeted with the power of web mining, the decline can rapidly accelerate.
The earnings factor
The media is replete with accounts of the stock market’s imploding prices for dot-coms. The consistent refrain is that once earnings were re-established as the primary basis for evaluating
companies’ worth, the dot-coms were shown to have the “Emperor’s new balance sheet”: profits were not only lacking, they were not forthcoming in the foreseeable future. This
focus isn’t new for most businesses – earnings have always driven valuations. However, the focus on other dot-com metrics such as GBF (get big fast), eyeballs, subscriptions and traffic
obscured the basics.
Yet despite the renewed focus on profitability, typical web mining problems often focus on response rate issues. Profitability is assumed to naturally follow from customer acquisition and responses
to offers. Unfortunately, that relationship is not only tenuous, it can actually work in reverse. Unaware data miners may be in the uncomfortable position of defending the flawed logic of,
“Yes, we lose money on each customer, but we’ll make it up on the volume!”
The response trap
One of the challenges of providing offers to consumers is the assumption that a consumer’s relationship with the business will be a profitable one. Unfortunately, that is not always the case.
Three examples of “problem responders” help to illustrate the issue.
First, responders to credit offers have dramatically higher credit loss profiles than the general population. In other words, the very people who respond to bank credit offers are frequently the
people a bank may desire the least. The result is that response models that aren’t adjusted to consider credit risk can actually bankrupt a financial institution. Several examples already
exist of lenders brought to the edge from these very types of models.
Second, sweepstakes promotions, which are widely deployed on the Internet (examples include the highly successful www.iwon.com search engine), draw
disproportionately lower-income consumers. While the income distribution effect tends to disappear for contests with prizes greater than $10 million, those cases are rare indeed. For the rest of
the promotions, advertising rates for the entire site may fall, and the average defection rate of consumers will be higher.
Third, web promotions often employ the use of premiums or incentives to encourage specific behavior. This has two potential disadvantages. The first drawback is the fact that consumers that are
drawn by one incentive tend to readily defect for a competitor’s incentive, creating an “incentive race” that over time dramatically increases costs for acquiring customers, and
drives down the average value per customer attracted. The second disadvantage is that incentives tend to drive customers to deceive the company into giving several premiums to the same customer.
Again, these efforts may attract lower-quality customers to the organization and increase average defection rates.
In each of these cases, predictive response models do exactly what they are intended to do: identify profiles of highly responsive customers. Unfortunately, failing to recognize that profitability
varies widely among responders can limit, or even eliminate, profits.
If focusing on response is problematic, and profitability is a preferred target to optimize, then the dependent variable needs to shift. Depending on the client and the organization, there are two
ways this can be achieved. The first option is the simplest. Modelers can use an existing profitability measurement, or create a profitability variable, and use that as the only objective. The
chief advantage is that it is straightforward and easy to understand. If the model is being used against a single marketing campaign, or focuses on a single channel, this is usually the best option.
However, if a modeler has a client with a very tight focus on response, another alternative is likely to work better. Rather than create a single profitability model and potentially throw out a
reliable response model altogether, a combination model might be more appropriate. In simplest terms, the modeler builds a response model first and appends the response model score to each record
in the file. The profitability model is then applied to the new data set.
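
The data flow of the combination approach can be sketched in a few lines of Python. This is only an illustration of the sequence (score for response, append the score, then score for profitability); both models are stand-ins with made-up coefficients, and the field names (`site_visits`, `clicked_offer`, `monthly_spend_index`) are hypothetical.

```python
import math


def response_score(record):
    """Stand-in response model: a logistic score from two made-up inputs."""
    z = -2.0 + 0.8 * record["site_visits"] + 1.5 * record["clicked_offer"]
    return 1.0 / (1.0 + math.exp(-z))


def append_response_scores(records):
    """Step 1: score every record and keep the score as a new field."""
    return [{**r, "response_score": response_score(r)} for r in records]


def profitability_score(record):
    """Step 2: stand-in profitability model applied to the augmented record.

    The appended response score is simply one more input variable.
    The coefficients are invented for illustration only.
    """
    return 50.0 * record["monthly_spend_index"] - 30.0 * record["response_score"]


records = [
    {"id": 1, "site_visits": 3, "clicked_offer": 1, "monthly_spend_index": 0.4},
    {"id": 2, "site_visits": 1, "clicked_offer": 0, "monthly_spend_index": 1.2},
]

augmented = append_response_scores(records)
for r in augmented:
    r["profit_score"] = profitability_score(r)
    print(r["id"], round(r["response_score"], 2), round(r["profit_score"], 2))
```

In practice each stand-in function would be replaced by a fitted model; the point is only that the second model sees the first model's score on every record.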
This second approach has its own advantages. First, it appeases the client’s desire to have a response model in hand. Many clients are so focused on seeing how a response model performed that
they are hard-pressed to hear about profitability first. This approach also demonstrates that the modeler truly listened to the client’s request. Beyond that, this allows the modeler to test
combinations of the two models, or to develop an optimization function to balance both approaches. If the modeler only has the profitability scores, no such function can be developed or deployed.
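
One simple candidate for such an optimization function, offered here as an assumption rather than a prescribed method, is expected gross contribution: the response probability multiplied by the predicted contribution if the lead responds, less the cost of the contact. The scores, the contact cost, and the `ScoredLead` structure below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredLead:
    lead_id: str
    response_prob: float           # response model score, between 0 and 1
    predicted_contribution: float  # profitability model output in dollars, if the lead responds


def expected_value(lead, contact_cost=0.0):
    """Expected gross contribution of contacting one lead."""
    return lead.response_prob * lead.predicted_contribution - contact_cost


def rank_leads(leads, contact_cost=0.0):
    """Return the leads worth contacting, best first.

    Leads whose expected value is zero or negative are dropped.
    """
    scored = [(expected_value(lead, contact_cost), lead) for lead in leads]
    keep = [pair for pair in scored if pair[0] > 0]
    keep.sort(key=lambda pair: pair[0], reverse=True)
    return [lead for _, lead in keep]


leads = [
    ScoredLead("A", 0.30, -40.0),  # eager responder who loses money
    ScoredLead("B", 0.10, 120.0),  # reluctant responder who is profitable
    ScoredLead("C", 0.20, 90.0),
]

ranked = rank_leads(leads, contact_cost=0.05)
print([lead.lead_id for lead in ranked])
```

Lead A has the highest response score but drops out entirely, which is the situation a response-only model cannot detect.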
What is profitability?
Until this point, I have used the term profitability somewhat generically. While the generally accepted definition of profits is revenue less expenses, determining exactly what qualifies as revenue
or as an expense can cause a surprising amount of disagreement in an organization. Accounting and finance departments usually have fairly strict rules for what “profits” include.
Considerations for depreciation, fixed cost allocations and accounting for “goodwill” all make the definition enormously complex.
Therefore, it makes sense to develop a proxy value for profitability, which will provide a useful substitute for profits. Without delving into a complex explanation of accounting theory, a few
basic concepts help. To begin with, the term “gross” is a lot easier to sell in an organization than the term “net”. Net indicates full, complete and final numbers, giving
it the same controversy level as the term “profit.” This is not what you want to be explaining in your presentation of the model. Therefore, it makes sense to keep the concept of
profitability in mind while using less controversial semantics. (Note: Synnetry.com, an e-commerce consulting firm, covers a more detailed review of calculating true profitability for a web
promotion at http://www.synnetry.com/busjust.asp.)
I recommend the use of something I call “gross contribution,” with the understanding that the accounting and finance groups can then further refine the calculations if they so choose.
The term “gross” implicitly shows that the calculation is not final. The term “contribution” comes from the accounting and marketing concept of “contribution
margin,” which basically indicates that allocations of fixed costs and depreciation are excluded. These adjustments should drastically reduce arguments over the calculation.
For the remainder of the article and ease of reading, think of “gross contribution” as our profitability proxy calculation.
Components of a profitability calculation
You should begin your calculation by deciding on the time horizon. As with most predictive modeling exercises, the further into the future you plan to predict, the less accurate your score
is likely to be. Six months seems to be a reasonable time horizon to start with, and then you can check your success on longer time horizons. While the concept of lifetime value seems a seductive
option, it is rarely the best place to start.
Next comes the revenue side. Revenue consists of all the dollars the customer gives to the organization. The sales price obviously needs to be included, as well as any related items, such as fees,
shipping and handling charges, and finance charges, if appropriate. Include all the charges due from the customer, even if they have not been received yet: unpaid amounts will be deducted on the expense side.
Expenses tend to be more complex. We have already determined that we are not going to include fixed expenses (rent for the buildings and equipment, for example) because of the complexity. However,
variable expenses that can be attached to the program being studied, and to the products or services offered, will be included. Important items that tend to be ignored include the costs of
returns and their shipping and handling, incremental customer service calls generated by the promotion (up to 50 cents for an automated attendant call and $7 for a 3-minute live operator
call, for example), and increased coupon redemption.
Revenue less expenses is the profitability proxy. Note that in many cases the gross contribution may be negative. This is to be expected, and bears out the
article’s earlier thesis about the dangers of ignoring profitability in modeling. For example, over 80% of the customers of many banks have slightly negative contributions, meaning that the
remaining 20% need to support the expenses of the entire base and all of the profits.
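
The calculation described above can be sketched as a small Python function. The field names are hypothetical, `cost_of_goods` is an assumed stand-in for whatever variable product or service cost applies, and the per-call costs simply reuse the example figures quoted earlier; a real implementation would take these from the organization's own records.

```python
from dataclasses import dataclass

AUTOMATED_CALL_COST = 0.50  # upper figure quoted in the text, per call
LIVE_CALL_COST = 7.00       # 3-minute live operator call, as quoted in the text


@dataclass
class CustomerActivity:
    # Revenue: everything billed to the customer within the time horizon
    sales: float = 0.0
    fees: float = 0.0
    shipping_handling_charged: float = 0.0
    finance_charges: float = 0.0
    # Variable expenses attributable to the promotion or product
    cost_of_goods: float = 0.0    # assumed field, not named in the article
    unpaid_amounts: float = 0.0   # billed but never collected
    returns_cost: float = 0.0     # returns plus their shipping and handling
    automated_calls: int = 0
    live_operator_calls: int = 0
    coupon_redemptions: float = 0.0


def gross_contribution(c):
    """Revenue less variable expenses; fixed costs and depreciation are excluded."""
    revenue = c.sales + c.fees + c.shipping_handling_charged + c.finance_charges
    expenses = (
        c.cost_of_goods
        + c.unpaid_amounts
        + c.returns_cost
        + c.automated_calls * AUTOMATED_CALL_COST
        + c.live_operator_calls * LIVE_CALL_COST
        + c.coupon_redemptions
    )
    return revenue - expenses


good = CustomerActivity(sales=100.0, shipping_handling_charged=8.0,
                        cost_of_goods=60.0, live_operator_calls=2)
bad = CustomerActivity(sales=40.0, cost_of_goods=25.0,
                       unpaid_amounts=40.0, live_operator_calls=1)
print(gross_contribution(good), gross_contribution(bad))
```

The second customer responded and bought, yet produces a negative contribution once the unpaid balance and service call are counted.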
Putting it all together
To put this into perspective, an example from a fictional cellular phone company, AllCall, may help. AllCall recently ran a promotion to drive new consumers to its web site to sell them one-year
cellular phone contracts. The project began with a targeted email campaign with a link to a special offer page. The page provided the offer of a free phone and two months of free service in
exchange for the contract.
AllCall’s star modeler, Nora Network, first builds a response model. The result is a model generating 70% of the responses from just 30% of the leads. To verify the results, Nora creates a
second model with gross contribution as the dependent variable. She then compares the income statements of the two groups. Several details emerge.
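
A figure such as “70% of the responses from 30% of the leads” is a capture rate: rank the leads by model score, take the top slice, and count the share of all responders that fall inside it. The sketch below shows one way to compute it; the ten scores and outcomes are invented toy data, not AllCall results.

```python
def capture_rate(scores, responded, top_fraction):
    """Share of all responders found in the top `top_fraction` of leads by score."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank leads from highest to lowest model score.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    cutoff = round(len(ranked) * top_fraction)
    total = sum(responded)
    if total == 0:
        return 0.0
    captured = sum(flag for _, flag in ranked[:cutoff])
    return captured / total


# Toy data: 10 leads, 3 of whom responded (1 = responded, 0 = did not).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
responded = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(capture_rate(scores, responded, 0.3))
```

The same function can be run with gross contribution in place of the response flag to see how much of the total contribution the top slice captures.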
First, the top deciles for the response rate model have a 30% higher credit loss rate than the group as a whole. They also have a lower activation rate, apparently attracted by the premium of the
free phone, although they also have a lower total call volume. In essence, the group picked by the response rate model was sufficiently lacking in potential income that their returns were actually negative.
In contrast, the profitability model’s top deciles have substantially lower response rates. The desired results come from the quadrant of high scores from the overlay of the profitability
model scores and response model scores. Filtering for this quadrant is logical, since only records likely to respond to an offer can become profitable. The difference is that the profitability
model “ferrets out” responders likely to generate losses or low profitability. It eliminates the “Trojan Horses” that can actually lead an organization to pursue red ink.
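
The quadrant filter can be sketched as follows. Using the median of each score as the cutoff is an assumption made for illustration; any threshold the business prefers would work the same way. The lead records and score values are hypothetical.

```python
from statistics import median


def top_quadrant(leads):
    """Keep leads that score above the median on both models."""
    response_cut = median(lead["response_score"] for lead in leads)
    profit_cut = median(lead["profit_score"] for lead in leads)
    return [
        lead for lead in leads
        if lead["response_score"] > response_cut and lead["profit_score"] > profit_cut
    ]


leads = [
    {"id": "A", "response_score": 0.9, "profit_score": -20.0},  # a "Trojan Horse"
    {"id": "B", "response_score": 0.8, "profit_score": 60.0},
    {"id": "C", "response_score": 0.3, "profit_score": 80.0},   # profitable but unlikely to respond
    {"id": "D", "response_score": 0.2, "profit_score": -5.0},
    {"id": "E", "response_score": 0.6, "profit_score": 30.0},
]

selected = top_quadrant(leads)
print([lead["id"] for lead in selected])
```

Lead A, the most responsive record in the file, is excluded because its profitability score falls below the cutoff.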
Response rate modeling is still valuable for web mining, as it is for other channels. It allows the targeting of marketing dollars and helps to profile those who are likely to become a customer.
Also, there are some cases where there just aren’t good data for profitability. For example, there remain some business strategies and test markets where “learning” replaces
earnings to help understand a given market.
Also, there can be a danger in making the time window too small. Frequently, products with a very long expected lifespan, such as home mortgages, may indicate negative earnings across the board in
the first 24 months, while a longer-term view might show something quite different.
Therefore, there are certainly cases where profitability should not replace response as the primary dependent variable. The real key is to make certain that earnings remain the ultimate focus
unless there is a compelling reason otherwise.
In general, the introduction of mining the data stores of web logs holds tremendous potential. Combined with the advent of “Internet time,” there is tremendous opportunity for
businesses to leapfrog their earnings. Predictive modeling will lead to many of those advances. And by focusing on the right objective, web miners will avoid red ink while driving the business
ahead of their competitors.