Data Modeling & Enterprise Project Management – Part 8

Published in TDAN.com January 2006
Previous articles in the series:
Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 | Part 7

This is the eighth in a series of articles from Amit Bhagwat.

Abstract

Data modeling is no doubt one of the most important and challenging aspects of developing, maintaining, augmenting and integrating typical enterprise systems. More than 90% of the functionality of enterprise systems centers on creating, manipulating and querying data. It therefore stands to reason that individuals managing enterprise projects should leverage data modeling to execute their projects successfully and deliver systems that are not only capable and cost-effective but also maintainable and extendable. A project manager is involved in a variety of tasks including estimation, planning, risk evaluation, resource management, monitoring & control, delivery management, etc. Virtually all of these activities are influenced by the evolution of the data model and may benefit from taking it as the primary reference. This series of articles by Amit Bhagwat goes through the links between data modeling and various aspects of project management. Having explained the importance of the data model in the estimation process, taken an overview of various estimation approaches, presented illustrative examples for them, considered the importance of intermediate and derived data and the effect of denormalization / normalization, and interpreted the generalization relationship from a relational DB perspective, the series now proceeds to look at the data with an OODB approach.

A Recap

In the first article[1] of this series, we established data-operation to be the principal function of most enterprise systems and inferred that data structure associated with a system should prove an effective starting point for estimating its development. We also discussed briefly the pros and cons of requirement-based vis-à-vis solution-based estimation.

In the next two articles[2] [3] we took a simple example to illustrate the function-based estimation approach and its simplified derivation in the data-based approach, highlighting the importance of considering only the data owned by the system and using the data-based approach only as pre-estimate / quick-check. We continued with this example to illustrate the effect of intermediate and derived data[4], and that of denormalization / normalization[5] on estimation. In the sixth article[6] we endeavored to interpret the inheritance relationship, which is at the heart of the OO paradigm, from a relational perspective.

In the last article[7] we worked on interpreting data associated with an object, understanding object constraints and appreciating ownership of data by objects to be able to provide the services that justify existence of those objects.

The conclusions were:

  1. Objects may carry intrinsic and extrinsic data that respectively determine their state and identity (including relationships).
  2. The intrinsic data associated with an object determines the state of the object and thereby dictates the response of that object to an external stimulus. The stimulus in turn may lead to a transformation in the state of the object.
  3. Well-designed objects usually have the ability to change their intrinsic data, whereas their extrinsic data associates itself with them from the context of their creation and can become a source of their destruction.
  4. Constraints can be applied to object behavior and relationships, based on the state of the object, in order to get the desired behavior out of objects.
  5. Good OO design delegates work and distributes responsibilities, thereby allowing objects to collaborate.

Agenda

In this article we shall look into using an OODB data structure as the basis for estimation. We shall use the example followed since the second article in this series and reinterpret the function-based and data-based approaches considered in the second and third articles respectively, revisiting them in the context of an OODB data structure. We are assuming here that the OODB does not allow multiple inheritance (as indeed OODBs, as a rule, do not). Finally, we shall touch very briefly on some other estimation approaches relevant to the object paradigm.

Before we proceed, it will be useful to have for our ready reference a view of important data elements in our illustrative example.

Fig. 1: A view of important data elements

The OO paradigm

Now that the OO paradigm is assuming the status of the de facto development paradigm, there is a trend among enthusiasts and spin-doctors alike to attribute all sorts of miraculous qualities to it. It is true that the OO paradigm has several advantages, and I, for one, am its ardent advocate; however, it is worth dispelling myths about it that can only lead to disappointment.

The strength of OO platforms does not lie in shrinking a greenfield application ‘to the size of a pea’, nor, strictly speaking, do OO platforms offer phenomenal code-reuse over and above that
offered by well-structured and modularized code on non-OO platforms. True, OO platforms can reduce the level of recompiling involved, and the OO paradigm encourages development of modular, hierarchical
and functionally well-factored code that is far less likely to materialize in mediocre non-OO design; however, comparing well-designed non-OO development with its OO counterpart, one should not
anticipate a reduction in development effort in a greenfield scenario.

The strength of OO platforms, apart from the more nature-like and better ordered thinking approach that the OO paradigm promotes, lies in the much higher level of maintainability, greater amenability to
incremental hot-enhancement (quicker incremental enhancement without significant reassembly) and simpler rules of scalability that OO platforms offer.

So, talking of development effort, OO platforms, with due application of the underlying OO paradigm, do not reduce development effort in a greenfield scenario, but they make the effort required for
maintenance and enhancement low and relatively straightforward to compute.

Of course, depending on the level of IDE and CASE support that the development environment offers (support which seems to be offered more readily and to a greater degree of sophistication in OO
environments), the adjustment multiplier (referred to as Technical Complexity Adjustment) and thus the final Function Point Index (FPI) will vary. This aspect, though, is out of scope for this series, as
we are concentrating here on the relevance and effect of the data involved and therefore confining ourselves to computation of Unadjusted Function Points (UFP).


Function-based approach

Let me take you back to article-2 of this series. There, we were interested in the inputs, outputs and entities involved in the context of the ‘transactions’ that defined the functionality required of our
subsystem. Now, in OODB terms, we will talk of classes rather than entities.

Classes may be abstract or instantiated. Abstract classes, though they require development effort, get used in the application when they are inherited and instantiated. Indeed, it is not
particularly object-oriented to have a significant quantity of static (i.e. class-level rather than instance-level) elements in a class. The development effort associated with an abstract class
therefore, in its turn, reduces development effort on at least one inherited instantiated class (which we assume must exist to justify the existence of the abstract class). In relational terms,
therefore, the data manipulation effort involved with an abstract class and its inherited instantiated classes may be likened to that spent on a dominant entity (which represents one of the
instantiated inherited classes and carries the attributes of the abstract superclass, plus, potentially, some of its own) and its subsidiary entities (which represent attributes specific to the other
instantiated inherited classes).

The transaction table from article-2 can therefore look very similar, and in general the adjustments made by the analyst to the output count, based on an understanding of the domain, continue to apply:

A similar estimation formula can then apply to obtain unadjusted function points.

0.58 x inputs + 1.66 x instantiated data classes + 0.26 x outputs

= 0.58 x 23 + 1.66 x 33 + 0.26 x 41

= 78.78 ~ 79
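As a quick check, the arithmetic above can be scripted. The weights 0.58, 1.66 and 0.26 are the coefficients used for inputs, entities/classes and outputs throughout this series; the counts (23 inputs, 33 instantiated data classes, 41 outputs) are taken from the transaction table. A minimal sketch:

```python
def ufp(inputs, data_classes, outputs):
    """Unadjusted Function Points, using the weights adopted in this series."""
    return 0.58 * inputs + 1.66 * data_classes + 0.26 * outputs

# Counts from the transaction table: 23 inputs, 33 instantiated
# data classes, 41 outputs.
total = ufp(inputs=23, data_classes=33, outputs=41)
print(round(total, 2))  # 78.78, i.e. ~79 UFP
```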

The inheritance relationships in our example present a dilemma though. The original design considered a template entity Borrowing. This manifested as mutually exclusive Present and Past Borrowing
entities; however, it was the Borrowing that was unique (thus also making instances of its mutually exclusive subclasses unique).

The way the subsystem was conceived, a Present Borrowing was Created, Read, Updated and Deleted, whereas a Past Borrowing was Created as a byproduct of the Delete of a Present Borrowing and remained thus
ever after. In that design, the use of two entities was justified on grounds of efficiency: the frequently interrogated Present Borrowing table had to be kept lean, while Past
Borrowing kept growing and could be archived / scrubbed as a separate maintenance activity.

In the last article we dwelt upon envisaging an intrinsic derived property isBorrowed with a Borrowing, derived from the Returned Date attribute not being set.

We suggested that if the Returned Date for the borrowing was not set, the borrowing behaved differently from when the Returned Date was set (whereupon it logically became a past borrowing), in that:

  1. The object was pooled
  2. The object counted towards Borrower’s quota of Borrowings
  3. Associated Borrowable Item was locked

The object pooling and object constraint mechanisms offered by modern OO platforms effectively eliminate the need for mechanical marshalling of Borrowings from Present to Past, and thus the associated
data operations.

This can therefore bring down both the transaction count and the entities associated with transactions. In our simplistic subsystem, we had decided that renewals were not within the scope of our
subsystem, and so the only occasion for editing a Present Borrowing was when it was returned. We then had a separate transaction, Update Borrowing Information, to marshal the returned current
borrowing into a past borrowing.

So with judicious use of object technology, including platform-supported pooling and constraint mechanisms, allowing the design to take a more life-like form, we can club transactions 6 and 7
into 6 alone and reduce the UFP by 0.58 + 1.66 x 2 + 0.26 = 4.16.

The object paradigm also changes the concept of a transaction. An object is responsible not only for performing operations on the data it owns but also for notifying its subsidiary objects so they may
perform the necessary operations on the data they own and cascade the event further, as necessary. This means that transaction 8 also loses its status as a separate transaction. The functionality is still
important, but there is no separate accessing of the Borrowing class. Rather, the Borrowing object has the constraint to notify the fine to the system operator and effect the spawning of its subsidiary
object, Fine. So the explicit functionality of transaction 8 is reduced to its output, causing a further drop of 0.58 x 2 + 1.66 = 2.82. Transaction 9 occurs subject to the operator’s behaviour. However,
once again, should it happen, it too will have its UFP count reduced by 1.66.

In all, the UFP count has come down by 4.16 + 2.82 + 1.66 = 8.64, or down to about 70. How cumbersome the application of constraints will be in the development process, and in which direction it will
move the adjustment multiplier (TCA), will, as stated before, remain separate considerations.
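The three reductions above can be tallied in the same way; each delta applies the series weights to the inputs, classes and outputs that the redesigned transactions no longer need:

```python
def ufp_delta(inputs=0, data_classes=0, outputs=0):
    """UFP contribution removed when transaction elements disappear
    (same weights as the main estimation formula)."""
    return 0.58 * inputs + 1.66 * data_classes + 0.26 * outputs

merge_6_and_7 = ufp_delta(inputs=1, data_classes=2, outputs=1)  # 4.16
constraint_8  = ufp_delta(inputs=2, data_classes=1)             # 2.82
constraint_9  = ufp_delta(data_classes=1)                       # 1.66

total_drop = merge_6_and_7 + constraint_8 + constraint_9
print(round(total_drop, 2))          # 8.64
print(round(78.78 - total_drop, 2))  # 70.14, i.e. about 70 UFP
```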


Data-based approach

In article-3 of this series, we worked through the underlying assumptions behind the data-based approach – the quick-but-approximate variant of the function-based approach. The final simple formula
was:

1.42 A (1 + (R/E)) + 8.58 E + 13.28 R

Given that this formula is quite approximate and assumption-riddled, we may be able to get a very rough figure for UFP if we:

  1. Substitute OODB classes owned by the system for E
  2. Do not count generalization as separate relationship
  3. Consider the attributes of an OODB parent class and any additional attributes of the inherited OODB classes as those of a single class, adding a notional attribute to the parent class for each
    generation of its inherited classes (e.g. if X and Y inherit Z, and W inherits X, then we have 2 generations of inherited classes: the children X & Y and the grandchild W)
  4. Follow other assumptions applied in article-3

This will result in the figure obtained in article-3, i.e. UFP = 78.
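Under those counting rules the article-3 formula evaluates as before. With A = 11 attributes, E = 3 classes and R = 2 relationships (the counts used later in this section), a quick sketch gives:

```python
def ufp_quick(A, E, R):
    """Data-based approximation from article-3, which assumes each
    class is involved in one Create, Read, Update and Delete each."""
    return 1.42 * A * (1 + R / E) + 8.58 * E + 13.28 * R

print(round(ufp_quick(A=11, E=3, R=2)))  # 78
```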

However, this assumed, among other things, that each entity was involved in one instance each of Create, Read, Update and Delete. The three entities (or their equivalent OODB classes), however, are
conspicuous in not being involved in deletion.

So our treatment in article-3 modifies to:

The total number of transactions (T) = 3E.

EPT (entities per transaction) = 1 + (2R/E) = (2R + E) / E

FPT (Fields per transaction) = A/E + RA/E²

Other assumptions:

  1. Each create / update transaction involves FPT inputs and nominal (1) output
  2. Each read transaction involves nominal (1) input and FPT outputs
  3. 2/3 of total transactions are create / update
  4. 1/3 of total transactions are read

Therefore the cumulative input count may be arrived at as the sum of the input counts from the various types of transactions:

I = (FPT)(2T/3) + (T/3) = (2FPT + 1) (T/3)

Substituting FPT = A/E + RA/E² and T = 3E, we get:

I = (2A/E + 2RA/E² + 1) (E)

= (2A + 2RA/E + E)

Likewise,

O = (2T/3) + (FPT) (T/3)

= (2E) + (A/E + RA/E²) (E)

= 2E + A + (RA/E)

And,

ET = (EPT) (T)

= ((2R+E) / E)(3E)

= 6R + 3E

The final formula for unadjusted function points,

UFP = 0.58 x I + 1.66 x ET + 0.26 x O

therefore transforms into

0.58 x (2A + 2RA/E + E) + 1.66 x (6R + 3E) + 0.26 x (2E + A + (RA/E))

= 1.16A + 1.16RA/E + 0.58E + 9.96R + 4.98E + 0.52E + 0.26A + 0.26RA/E

= 1.42A + 1.42RA/E + 6.08E + 9.96R

= 1.42A (1 + (R/E)) + 6.08E + 9.96R

Thus for A = 11, E = 3, R = 2

UFP = 15.62 (5/3) + 18.24 + 19.92

= 26.03 + 18.24 + 19.92

= 64.19 ~ 64

i.e. an underestimate, but just within a 10% margin.
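The revised derivation can be captured in the same style; the only changes from the article-3 formula are the E and R coefficients (6.08 and 9.96 instead of 8.58 and 13.28), reflecting T = 3E rather than 4E transactions:

```python
def ufp_no_delete(A, E, R):
    """Data-based approximation when no class takes part in a
    deletion transaction (T = 3E), as derived above."""
    return 1.42 * A * (1 + R / E) + 6.08 * E + 9.96 * R

estimate = ufp_no_delete(A=11, E=3, R=2)
print(round(estimate, 2))  # 64.19, i.e. ~64 UFP
```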

There are two ways in which this quick-but-approximate approach may be harnessed.

If there appears to be a significant proportion of OODB classes that are not destroyed, then use the two approximate calculations, i.e. one considering the involvement of each OODB class in a deletion
operation and the other considering only create, read and update. The two values obtained will usually give a range within which the more accurate function point count will fall.

The other way is to assume that a fraction Z of all the OODB classes E is not going to be involved in a deletion operation, while the rest, i.e. (1-Z)E, will be involved in deletion operations. We
can then expand the mathematical treatment and identify those OODB classes that are not going to be involved in a deletion operation, thereby knowing Z. This gives the assumed value of the
total number of transactions as 3ZE + 4(1-Z)E. This approach may fall a little closer to the more accurate function-based estimate, but will also consume additional time in cherry-picking the OODB
classes unlikely to be involved in a deletion transaction.
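Both ways of harnessing the approximation can be sketched together. The bracketing range uses the two closed-form variants as-is; the `ufp_blend` function is a simplifying assumption of mine, a linear Z-weighted mix of the two results standing in for the fuller expansion of the mathematical treatment:

```python
def ufp_full(A, E, R):
    """Article-3 formula: every class involved in Create, Read, Update, Delete."""
    return 1.42 * A * (1 + R / E) + 8.58 * E + 13.28 * R

def ufp_no_delete(A, E, R):
    """Variant derived earlier: no class involved in a deletion (T = 3E)."""
    return 1.42 * A * (1 + R / E) + 6.08 * E + 9.96 * R

def ufp_blend(A, E, R, Z):
    """Hypothetical simplification: a fraction Z of the E classes is never
    deleted, giving T = 3ZE + 4(1-Z)E transactions in all. Here the two
    closed-form results are simply mixed linearly by Z."""
    return Z * ufp_no_delete(A, E, R) + (1 - Z) * ufp_full(A, E, R)

A, E, R = 11, 3, 2
low, high = ufp_no_delete(A, E, R), ufp_full(A, E, R)
print(round(low), round(high))           # 64 78 -- the bracketing range
print(round(ufp_blend(A, E, R, Z=0.5)))  # 71 -- falls between the two
```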


Other Approaches

As discussed in the first article, estimation approaches can be broadly classified into those used while the problem is being understood and those applied once the solution is designed to a varying
degree of completeness. Function Point Analysis belongs to the second category and has a slight accuracy advantage over the first, at the cost of the extra time that passes in the project before it
can be applied.

There are many techniques of varying popularity under the first category, and I have mentioned some in the first article, including some of my own efforts[8].
Leslee Probasco has summarised in The Rational Edge[9] the work of Gustav Karner with Ivar Jacobson on Use Case Points, in some respects likened to
FPA, but essentially in the requirements rather than the solution domain. Card et al.[10] have likewise deliberated on the Task Point system, which is, once
again, functionally inclined. It goes down to atomic tasks and thus needs more time than Use Case Points, but proves a little more accurate. On the other hand, Task Points are a little less accurate
than Function Points, but are obtained earlier.

We have been working through this series with the Mark II branch of FPA, more popular in the old world[11]. Whitmire[12], beginning with the American variant of FPA, has worked on a Function Points approach for OO.


Conclusions

In this article, we applied the function-based and data-based FPA techniques in the context of data in OODB form. We concluded:

1. Function-based FPA can be applied to an OODB by substituting OODB class count for entity count
2. Data design can change in the OODB paradigm due to, among other things:
a. The concept, central to the object paradigm, of ownership of one’s own data and responsibility towards dependent data
b. The concept of object constraints and support for their implementation on an OO platform
c. Run-time mechanisms such as object pooling that can impact the physical design of data
3. The data-based approximation of FPA can be tailored to accommodate varying situations, such as lower involvement of data in deletion transactions
4. For all its simplicity and speed, the data-based approach is essentially approximate
5. Estimation is best performed recursively, getting closer to accuracy while moving from functional requirements to task counts to solution structure


What’s Next

We have now reached a logical point where we can step out of the Estimation thread and move on to the next thread – planning, unless the readers would like elaboration on any topic within the
Estimation thread.

——————————————————————————–

[1] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 1: Estimation – TDAN (Issue 26)

[2] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 2: Estimation Example – The Function-based Approach – TDAN (Issue 27)

[3] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 3: Estimation Example – The Data-based Approach – TDAN (Issue 28)

[4] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 4: Estimation – Considering Derived & Intermediate Data – TDAN (Issue 30)

[5] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 5: Estimation – Considering Effect of Denormalization / Normalization – TDAN (Issue 31)

[6] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 6: Estimation – Interpreting Generalization with Relational Approach – TDAN (Issue 32)

[7] Amit Bhagwat – Data Modeling & Enterprise Project Management, Part 7: Estimation – Interpreting Generalization with Relational Approach – TDAN (Issue 34)

[8] Amit Bhagwat – Estimating use-case driven iterative development for “fixed-cost” projects – The Rational Edge (October 2003)

[9] Leslee Probasco – What About Function Points and Use Cases? – The Rational Edge (August 2002)

[10] Card, Emam, Scalzo – Measurement of Object-Oriented Software Development Projects, 2001, Software Productivity Consortium

[11] Maintained by United Kingdom Metrics Association

[12] Scott A. Whitmire – 3D Function Points: Applications for Object-Oriented Software, Proceedings: Applications of Software Measurement Conference,
1996.


About Amit Bhagwat

Amit Bhagwat is an information architect and visual modeling enthusiast, in the thick of object oriented modeling and its application to information systems. He has developed a specialised interest in applying techniques from pure sciences to data modeling to get the best out of MIS / BIS. He is an active member of the precise UML group and happens to be the brain father of Projection Analysis - http://www.inconcept.com/JCM/June2000/bhagwat.html - and Event Progress Analysis - http://www.tdan.com/special003.htm - techniques. He also maintains contents for the celebrated Cetus Links - http://www.cetus-links.org - in Architecture and Design areas. He shares a variety of other interests including photography, poetry and sociological studies. Explore some of his work at: http://www.geocities.com/amit_bhagwat/
