*Published in TDAN.com April 2002*

Most of us study bits of elementary Physics as we get through our schooling. A few of us find it fascinating; a few others reckon it too idealistic, with bodies continuing to move forever purely by divine miracle. Many more find it awfully boring, with so many laws to remember and equations to derive. Indeed, very few of us have anything at all to do with much of Physics as a practical science. Yet there are many principles and practices originating in Physics that have value in virtually every sphere of our life, and the Information Industry is by far among the greatest beneficiaries of Physics.

Back in my school days I was much amused by the concept of dimensional analysis. Its greatest merit was to let one perceive most quantities and their units as derived from a small number of Fundamental Quantities. In short, it meant that whereas we often try to make life complicated by assuming quantities that we do not fully understand or relate, the universe is essentially built around only a handful of elementary ones. Conversely, our mastery of everything rests on our mastery of manipulating that handful of elementary things. From that moment onwards I liked Physics for being the simplest of sciences. Later on, as I studied Chemistry, I found the same concept repeating, where everything was made up from a few chemical elements. The two together have emphasized that complex-looking things are essentially made of simple bits, and how much sense we make of the seemingly complex world depends on our ability to combine simple things in the proper quantities and context.

Why have derived quantities?

The question may be broken into two:

- Why reckon in quantities that turn out to be derivable?
- Why derive these quantities?

Why reckon in quantities that turn out to be derivable?

This first question is not that difficult to answer, for there are many of these quantities that we actually ‘feel’. Take the example of our routine of traveling to and from our office (not all of us have an e-office with flexi-hours (or further, virtual hours!), however often we may wonder: why not?). Ultimately, the objective is to travel so as to get from one set of latitude and longitude, i.e. position, to another. This corresponds to a set distance following the shortest available route (e.g. a straight line for good crows on a flat earth); but since we have to do this in such time as to make the most of our morning, avoid a boring crawl and still meet the office schedule, we are a lot more concerned about the speed that we end up traveling at. Indeed, we are often willing to travel extra distance if it means better speed and thus more probability of making the most of our morning. So speed, though derived, is more important to us than distance.

Further, imagine a route full of ups and downs, bottlenecks and fast carriageways, where the overall average speed of travel is a bit higher than on a flat country road in ‘good-enough’ conditions and with modest traffic. Are we quite certain that we shall always go by the path allowing greater average speed and thus perhaps a bit less time? On that path the car has to spend a fair amount of extra energy accelerating after every bottleneck and giving in gas on every uphill slope. We also need to spend much more of our own energy fiddling with the pedals and being extra vigilant. Perhaps a bit more distance, a bit more time and a bit less speed are all fine if they mean considerably lower acceleration / retardation and thus lower usage of energy. So energy consumption and acceleration / retardation, though involving derivation a level deeper, are more important in our consideration.

We can go on and on thus, but the two examples above have adequately demonstrated why we are often more concerned about quantities that turn out to be derived. To manage our day, our Information System seeks derived quantities rather than fundamental ones. We have just seen a real-life example of what drives the simplest of Decision Support Systems: appropriate quantities, however intricately derived they may be, that can be measured and mapped most directly to our end objectives. In our example, the goal is to make the most of our morning drive, and this we achieve by measuring the energy consumed by the car and by us, which we try to keep to a minimum in practical terms.

The next question is: how do we measure energy? We seldom measure it in its own units such as joules (one wonders how many of us have heard of these), kWh or horsepower-hours. We measure consumption in the context of a car as mileage (distance per unit volume of fuel), brake wear (width per unit time / distance) and service frequency (as incidences per unit time, or maybe as a ratio of down time to operational time). Those few of us who care enough will probably also measure emission of pollutants (as mass per unit distance traveled?). Further, whereas there are ways of converting most of these quantities into “running cost”, we are not always satisfied by that simple conversion, since our beloved car needing frequent servicing, a need for more frequent fuel stops and the environmental impact have implications (that may be made tangible with some thinking) beyond money. We thus often tend to see their trends separately and then work out the best option. All of us keep track of these derived quantities to make the decision that appears most appropriate to us. So commonplace and often subconscious is this reckoning that we would ridicule anyone who called it our “Route Management Intelligence System”. But just add the bits and pieces of this problem that are specific to individual road users in our area and aggregate their metrics. What we suddenly have is the Traffic Management System of the area, a system which is beyond any doubt complex in itself and linked intricately with those of other geographical areas.

To attempt to describe this system we shall readily volunteer to use much of the available jargon in the world. This time around, therefore, we fully appreciate the enormity of the culmination of our individual little exercises with derived quantities, and we would readily cut our heart out for anyone who could solve this for us. (In fairy tales they still give away half the kingdom and the princess’s hand!)

Why derive these quantities?

Probably this question has become less poignant, or indeed less of a question, with the exercise that we have gone through. There is no doubt that we need to do our metrics in the most appropriate quantities, and to make the most of them we need to understand their relation to other quantities. Debate may still be rife as to which quantities should be fundamental. Probably this debate will continue eternally. Many will suggest that fundamental quantities Are Fundamental, that they come by intuition, as say distance and time in our example. I know, however, from the same Physics where all this started, that intuition is neither universal nor unique.

Back in those good old days when the intelligentsia were supposed to wear the weirdest of attire and hairstyles, perhaps to make thinking by far the easiest thing in life, there existed two “Physics”: the Electric Physics and the Magnetic Physics. The electrical folks reckoned that Charge was the fundamental quantity, whereas the magnetic folks insisted that Current (which later turned out to be flow of charge per unit time) was The Fundamental Quantity. By the time the two branches merged to form Electromagnetic Physics, the “magnetic” folks had far excelled the electric folks in their influence on industry. So, much against the intuition of a modern physicist, Current continued to be a fundamental quantity and Charge got to be derived from it. Whereas theorists take up issues like these to get at each other’s throats from time to time, in practice the arrangement works well, since even today we, Physicists or Realists (!), measure current a lot more than we do charge, making charge effectively the phantom commodity.

So, in the end, all that can be said about fundamental quantities is that it is good if they can be tangibly thought of and expressed, can be measured fairly comfortably and accurately in their own terms, and provide building blocks for other quantities that are measured / expressed with them as constituents. They then, by allowing quantities to be derived from them, provide an effective way of relating these derived quantities with each other.

Dimensional Analysis

Perhaps many of you have begun wondering: all this may be well as it is, but where the hell does the title of this article come from? Well, it comes from everything that we have been talking about so far. In our little exercise with the car, we talked about distance (or length) and time as the fundamental quantities. We next talked about speed, i.e. distance (change of position) per unit time, and acceleration / retardation, i.e. change in speed per unit time. Further, to keep matters simple, we considered energy without considering the mass of the laden vehicle; effectively, we talked about energy per unit mass.

As we were proceeding with the argument, things up to acceleration were all simple, since speed was the time derivative of distance and acceleration the time derivative of speed. Further, talking about energy, we realized that the more the acceleration, the more the energy needed. But we had no way of telling whether doubling acceleration meant doubling energy consumption and, further, for what matters to us most, whether doubling energy consumption doubles the expenditure. Is it not obvious? No, it isn’t. Indeed, those of you who have studied the current and power ratings of an electrical gadget may have noted that if the current rating doubles (for a given resistance), the power rating quadruples, and so does the amount that we shell out while using it. Put in general terms, expenditure goes as the square of the current rating of the appliance. Now that’s frightening, and so it is for Enterprises whose fortunes depend on the answer to one question: what other quantities, in what way and in what proportion, change the quantities that matter most to them?
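The electrical example can be checked with a line of arithmetic. This is a minimal sketch, assuming a fixed-resistance appliance so that power follows P = I²R; the resistance value is purely illustrative:

```python
# Power dissipated in a fixed resistance R goes as the square of the current:
# P = I^2 * R, so doubling the current quadruples the power (and the bill).
def power_watts(current_amps, resistance_ohms):
    return current_amps ** 2 * resistance_ohms

R = 10.0  # ohms; illustrative value, not a real appliance rating
assert power_watts(2.0, R) / power_watts(1.0, R) == 4.0  # double I -> 4x P
```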

Physics answers this question through dimensional analysis. As Physics does, let’s represent distance (length) by L, mass by M and time by T. Speed is distance per unit time, i.e. L ÷ T, which may also be represented in power notation as L¹T⁻¹. Speed is therefore proportional to [L¹T⁻¹]. This is spoken of in Physics as: speed has the dimensions [L¹T⁻¹]. Likewise, acceleration / retardation has dimensions [L¹T⁻²]; the force necessary to produce this change of speed has the dimensions of the product of [M¹] and [L¹T⁻²], i.e. [M¹L¹T⁻²]. Further, the energy needed to do the work of applying this force over a distance [L¹] has dimensions [M¹L²T⁻²], and the power of the vehicle, i.e. its ability to do work in unit time, has dimensions [M¹L²T⁻³].
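The chain of derivations above can be mechanized. The sketch below is an illustrative Python construct of my own (not from any library) that stores a dimension as exponents of M, L and T, so that multiplying quantities adds exponents and raising to a power scales them:

```python
# A dimension as exponents of the fundamental quantities (M, L, T).
class Dim:
    def __init__(self, M=0, L=0, T=0):
        self.e = (M, L, T)
    def __mul__(self, other):      # multiplying quantities adds exponents
        return Dim(*[a + b for a, b in zip(self.e, other.e)])
    def __pow__(self, n):          # raising to a power scales exponents
        return Dim(*[a * n for a in self.e])
    def __eq__(self, other):
        return self.e == other.e
    def __repr__(self):
        return "[M^{} L^{} T^{}]".format(*self.e)

M, L, T = Dim(M=1), Dim(L=1), Dim(T=1)

speed        = L * T**-1        # [L^1 T^-1]
acceleration = speed * T**-1    # [L^1 T^-2]
force        = M * acceleration # [M^1 L^1 T^-2]
energy       = force * L        # [M^1 L^2 T^-2]
power        = energy * T**-1   # [M^1 L^2 T^-3]

# The factorisation used below: energy is (mass) x (speed squared), dimensionally.
assert energy == M * speed**2
```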

If you are still tuned in, you are likely to wonder: we did all this ourselves, so where are the dimensions that we so painstakingly derived ever put to use? Wherever you like! Say you want to know what will happen to energy consumption if you drive at double the speed by slamming the gas hard. We are talking about energy [M¹L²T⁻²], i.e. [M¹][L²T⁻²], i.e. [M¹][L¹T⁻¹]², and speed [L¹T⁻¹]. The answer is plainly obvious: energy consumption goes up as the square of speed, and so it is quadrupled when we double the speed by slamming down the gas. Hardly worth the wastage! You also see that, unlike with speed, the energy consumption goes up in proportion to the loading [M¹] of the vehicle and not to its square. So, qualitatively, speeding does far greater damage to our economics than loading does.

Now try and remember your miserable self, restlessly preparing for the driving theory test, trying desperately to remember the stopping distance of a car at different speeds. It was easy to follow that the total distance is the sum of the “thinking” distance (the distance traveled while your reflexes slammed the brakes) and the braking distance, when you and the brakes were doing the best possible. Whereas this sum was easy, remembering these two distances for at least 6 speeds was near impossible. But did you need to remember them all?

For the thinking distance, we are talking about distance [L¹], i.e. [L¹T⁻¹][T¹], at a specific speed [L¹T⁻¹] and reflex time [T¹]. A look at the dimensions and the answer comes straight out: the thinking distance varies directly as both the speed and the reflex time. So double the speed and, for the same attentiveness on your part, the distance doubles. Likewise, the braking distance [L¹], i.e. [L²T⁻²][L⁻¹T²], i.e. [L¹T⁻¹]²[L¹T⁻²]⁻¹, varies inversely as the braking retardation [L¹T⁻²] and directly with the square of the speed [L¹T⁻¹]. Thus double the speed and the braking distance quadruples. The important point driven home is that whereas there is no excuse for not being attentive behind the wheel, higher speed is far more disastrous.
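The same deduction can be sketched numerically. The reflex time and braking retardation below are illustrative assumptions, not official highway-code figures:

```python
# Stopping distance = thinking distance + braking distance.
# thinking = v * t_reflex      dimensions [L^1 T^-1][T^1]        = [L^1]
# braking  = v^2 / (2 * a)     dimensions [L^1 T^-1]^2 [L^1 T^-2]^-1 = [L^1]
def stopping_parts(v_mps, reflex_s=0.7, decel_mps2=6.5):
    thinking = v_mps * reflex_s
    braking = v_mps ** 2 / (2 * decel_mps2)
    return thinking, braking

t1, b1 = stopping_parts(13.4)   # roughly 30 mph
t2, b2 = stopping_parts(26.8)   # roughly 60 mph: double the speed
assert abs(t2 / t1 - 2.0) < 1e-9   # thinking distance doubles
assert abs(b2 / b1 - 4.0) < 1e-9   # braking distance quadruples
```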

As manager of the Enterprise of driving your car, you have just made two important decisions based totally on dimensional analysis:

- For fuel economy, you should remove unnecessary load from your vehicle, but far more importantly you should not slam the gas and speed up too far.
- For a safe drive, you should be alert while behind the wheel, but once again, you should be extremely careful about speed, as the positive effect of the best reflexes cannot compensate for the negative effect of speeding.

How does this fit with BIS?

If you have worked in an environment where terms like MIS, DSS, BIS and OLAP have been within earshot, then you have probably also heard about data warehouses and perhaps know a lot about them.

Further, the word dimension hasn’t been unknown to you, and perhaps you have done a substantial amount of cubing yourself. In the next two sections, we’ll recap what fundamental things organizations look for in their Business Intelligence Systems and the way they tend to get close to their solution, before pointing out what they do not get through the way they work.

What do organizations look for in their BIS?

In all honesty, there is no all-encompassing answer to this. The closest that we can come to generalizing is that they look for metrics of their “performance”, a concept that usually has facets like efficiency, effectiveness, growth, market dominance, stability, etc. Many of these are perceived as change in various quantities translatable into a monetary equivalent, over various scoping factors such as period of time, geography, population, etc., and competitive factors such as competitor popularity, standard of living, etc. The organizations are therefore looking first to determine what facts are important to their success (their CSFs, to throw in another jargon) and, having perceived or at least narrowed down their choice of important facts, they go about relating them to other variables in the environment to find the best movement of these variables in the organizations’ interest.

How do they go about this?

They go about this in just the way they think. They first determine the facts that are important to them. They then list the variables that they consider to influence the facts. Next they wait and wait and wait and, if they are not doomed in the meanwhile, they compile a chart full of the fact(s) that matter to them, with values of the variable(s) that may have had anything to do with the facts. They then group each variable at different granularities and match the fact(s) against each to see if, by any chance, a fact shows a trend against change in the value of the variable. If it appears to, they make further business decisions based on this and hope that these will prove to be in the right direction, if not necessarily accurate. Whether the physical arrangement of data is redundantly multidimensional (typically for frequent and complex analysis of a modest size of data) or relational (for modest querying and analysis of large and / or legacy data) or a mix of these, and whether or not the dimensions are solely and independently defined by granularity, this type of data view may be logically perceived as an asterisk, centred around fact data and viewed through different dimensions, as shown in Fig.1. We need a reliable and sufficient quantity of concurrent data to be able to map facts to dimensions before this data can tell us the obvious. Even then the best it can do is provide us trends rather than prophecies.

What do they miss out?

What the method explained above misses out on is analytically (rather than empirically) relating the quantities that make up different dimensions. The method therefore carries the potential hazard of actually representing one dimension twice. Worse still, the method has little success in predicting the collective effect of multiple dimensions on the system. Neither does it allow the dimensions to be extrapolated to simpler quantities on one side and to more complex, but meaningful, derived quantities on the other. It therefore rarely hits on the set of dimensions that will best predict the facts.

The Car Driving Enterprise

With classical BIS

Let us now assume that our Car Driving Enterprise did not have the benefit of dimensional analysis. What best could it have come up with following conventional BIS methods? It would have first reckoned the facts that matter most to it, i.e. time of travel, comfort of travel and cost of travel. Further, whereas perhaps it would have perceived at some point that comfort of travel had a lot to do with avoiding frequent and eccentric use of the pedals, the measurement of this being difficult to imagine without thinking of dimensions and derived quantities, this fact would have been left out.

Now, to get the best time of travel, it would first have been necessary to get ‘statistically significant data’. The car would thus have been driven for a few months on each candidate route. This would have given the comparative time directly, though it would have meant going late (though mostly too early) to the office until a trend of this time was obtained, i.e. doing business in a variety of wrong / inefficient ways to find the right / efficient way. What about extrapolating the results if the place of work should change? Well, that is beyond our capacity. The best that could have been done was to hire a consultancy and provide it with our data of time taken and distance traveled, so it could find for us the average speeds, draw a graph of these against the time taken, and give us an “intelligent guess” as to, for a typical urban road pattern, what minimum approach distance and what average route speed offer the possibility of the least time. A conclusion that would have been too specific, cost us a fortune and still left the usual bit of statistical error and uncertainty hanging around.

Going likewise, the best that we could have done about energy consumption would have been to work out the mileage per fuel tank on different routes after a good deal of meticulous observation, then work out the miles on each route and thus find only the most direct cost: the cost of fuel for each route.

Thus, our metrics fail to give us any guess on one of our Critical Success Factors and provide rough guesses for the other two that may not be applied universally. All this is due to the failure to understand that the extent and frequency of the acceleration / retardation cycles needed on the route, i.e. the rate of change of speed over time or distance, was the ultimate fact that we looked for; something that we could have easily related dimensionally. In this case, it was not a fundamental quantity, neither was it tangible. It was, though, definitely derivable and could be mapped most directly to the path of travel. Our data model for the classical BIS approach thus logically looks like Fig.2: many quantities for fewer predictions, and without a definite relation between some of them.

With Dimensional Analysis

First, as we went about it in our little exercise above, we did not actually need to load our car differently or drive at different average speeds to gather data, create nice charts and trend graphs, in order to conclude that loading, but more importantly speeding, were bad for our fuel economy. Further, our little deduction on braking indicated that we needed more braking retardation if the speed was higher or the braking distance allowed was less. Similarly, we needed acceleration to get back to cruising speed after braking, which again would be greater if we were to gain back the speed in the shortest possible time.

This brings home that roads whose driving conditions change remarkably and frequently along their length, and thus demand high amounts of acceleration / retardation [L¹T⁻²], end up proportionately heavy on energy [M¹L²T⁻²], i.e. [M¹L¹][L¹T⁻²]. Thus we have the necessary conclusion ready without any delay or need for data gathering or analysis. If we must provide the value, and not just the trend or proportionality, of the energy consumed, all we need is to measure the acceleration and retardation that the car produces against the distance that it travels: the job of a device as simple as a spring-loaded plotter on a drum connected to a wheel / driving shaft / odometer of the car. Know the total time of our journey, and this simple measurement, with bits of dimensional analysis, tells us everything that there is to be told about the enterprise. E.g. the frequency and level of its extreme points give us an indication of the extra energy gone into acceleration and braking, which also corresponds to the extra fuel consumed and brake wear caused, which in turn correspond to the jerks and pedalling strain experienced by the driver. We thus have all three facts that we set out to investigate. Our data model logically looks like Fig.3, having fewer quantities to measure, all the predictions that we need, and the knowledge that these are related.
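Numerically, the plotter’s trace amounts to something like the sketch below: given acceleration samples along the route, sum m·|a|·Δd to compare the extra energy spent fighting speed changes. The mass, sampling step and sample values are all made-up figures for illustration:

```python
# Given acceleration samples a_i (m/s^2) taken every step_m metres, the extra
# work done accelerating and braking is roughly sum of m * |a_i| * step_m:
# the dimensional product [M^1][L^1 T^-2][L^1] = [M^1 L^2 T^-2], i.e. energy.
def extra_energy_joules(mass_kg, accel_samples, step_m):
    return sum(mass_kg * abs(a) * step_m for a in accel_samples)

smooth  = [0.1, 0.1, 0.1, 0.1]     # gentle country road (illustrative trace)
stop_go = [2.0, -2.0, 2.0, -2.0]   # bottleneck-ridden route (illustrative trace)
m, step = 1200.0, 100.0            # laden car, 100 m sampling; assumed values
assert extra_energy_joules(m, stop_go, step) > extra_energy_joules(m, smooth, step)
```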

Generalized view of dimensional analysis

Thus, through our example, we have noted that dimensional analysis:

- Breaks seemingly complex quantities down into fundamental quantities
- Allows derived quantities to be related to and expressed through others
- Establishes proportionality rather than just the trend of variation of one quantity with others
- Zeroes in on the quantities most important for establishing facts
- Narrows down business metrics
- Reduces the burden of data analysis
- Provides higher accuracy of analysis
- Provides more accurate predictions

It therefore greatly reduces the efforts of data gathering for a greenfield BIS. It also reduces the effort of data analysis for an evolving BIS.

What next?

Having theorized all that is Bold and Beautiful, and driven the point home through the simplest of examples, it is time that we did more work on real-life business problems and streamlined our methods. Having seen its contribution, over centuries, to complex analysis problems in Physics, I see no impediment to the successful use of dimensional analysis in similar real-life business situations, in a typical enterprise BIS. Start applying it to your business problems right away.

The typical steps will be:

- Have your business definition ready
- Find the quantities for which you need the trends (let’s call them Ts)
- Analyse the composition of each T in terms of simpler quantities (call them Ss). Let it not bother you whether they are absolutely fundamental, since fundamental is often a paradigm-specific term.
- Establish the level of influence of the Ss on the Ts
- Establish the Dimensions of the Ts (do not expect them to come as clean and definite as in our example) in terms of the Ss
- Iterate if necessary to establish the dimensionality more accurately
- Work with combinations of the Ss that will themselves make measurable quantities (call them Ms) and will together work out to establish the Ts
- If you have a data warehouse / data-mart, locate these Ms in the data available to you
- Devise a technique of measuring the Ms to establish / check your trend
- Analyse the Ms and extrapolate to establish the trends in the Ts
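As a small illustration of the “establish Dimensions of Ts in terms of Ss” step, the sketch below solves for the exponents p and q such that speed^p × acceleration^q has the dimensions of distance. The two-unknown solver via Cramer’s rule is a minimal construct of my own, not a library routine:

```python
from fractions import Fraction

# Dimensions as (L, T) exponent pairs. We want p, q with
#   p * dims(speed) + q * dims(acceleration) = dims(distance),
# a 2x2 linear system in the exponents, solved by Cramer's rule.
def solve_exponents(c1, c2, target):
    det = c1[0] * c2[1] - c2[0] * c1[1]
    p = Fraction(target[0] * c2[1] - c2[0] * target[1], det)
    q = Fraction(c1[0] * target[1] - target[0] * c1[1], det)
    return p, q

speed, accel, distance = (1, -1), (1, -2), (1, 0)
p, q = solve_exponents(speed, accel, distance)
assert (p, q) == (2, -1)  # distance ~ speed^2 / acceleration: the braking-distance law
```

The same mechanism generalizes to more Ss: with n simpler quantities, the exponent matching becomes an n-variable linear system, which is the spirit in which dimensions relate the Ms to the Ts.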

Feel free to send me cases relevant to your enterprise and we may be able to solve some of them in coming issues, while further perfecting our methods.

*This article was also published in the March 2002 issue of the Journal of Conceptual Modeling.*

www.inconcept.com/jcm
