Data Modeling & Enterprise Project Management Part 2: Function-Based Approach


Published in January 2004

Articles in this series: Part 1, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8.

This is the second in a series of articles from Amit Bhagwat.


Data modeling is no doubt one of the most important and challenging aspects of developing, maintaining, augmenting and integrating typical enterprise systems. More than 90% of the functionality of enterprise systems is centered on creating, manipulating and querying data. It therefore stands to reason that individuals managing enterprise projects should leverage data modeling to execute their projects successfully and deliver systems that are not only capable and cost effective but also maintainable and extendable. A project manager is involved in a variety of tasks including estimation, planning, risk evaluation, resource management, monitoring & control, delivery management, etc. Virtually all of these activities are influenced by the evolution of the data model and may benefit from taking it as the primary reference. This series of articles by Amit Bhagwat goes through the links between data modeling and various aspects of project management. The first article explained the importance of the data model in the estimation process and gave an overview of various estimation approaches; this second article presents illustrative examples for the Function-based Approach.

A Recap

In the last article[i], we established that most enterprise projects create or modify means of creating, modifying or interrogating a simple set of data, including derived data. We therefore inferred that the effort associated with such a project is best arrived at once the data structure associated with the end system is understood.

Depending on the nature of the project, the data structure may exist as part of the requirement description, e.g. for an enhancement or reengineering project; or it may have to be created out of
requirements. The last article briefly touched upon how requirements may exist in use case format and how they may go through the process of use case realization for the data structure to emerge.
We also discussed, very briefly, an estimation approach (or rather ‘guesstimation’, as it is often called given its potential for error) based on the use cases themselves, prior to realization. Any further expansion on that is outside the scope of this work and particularly outside the context of this publication. It suffices to suggest that this approach can be useful if applied in the converging estimation pattern, as explained elsewhere[ii].

We went further in the last article to discuss various approaches to linking the data model to estimation, and to consider the impact of funding style on the estimate, in the context of the data model. We resolved to work through these concepts in the forthcoming articles.


It is this working through of a simple data model, and converting it to numbers under various function-point based approaches, that we are going to concentrate on at this point. The case we are choosing for illustration purposes is essentially a simple one, one that ordinarily would not go higher than the Subsystem level in the solution definition hierarchy. In this issue, we are focusing on the Function-based Estimation approach.

The Case

As explained above, I mean to take a very simple case for illustrative purposes, one that won’t go much above subsystem level. It is worth noting that scaling does involve economies or
diseconomies and is therefore seldom linear.

We are going to consider the book lending facility at a public library, which lends only books, only to its registered borrowers and free of charge, and accepts returned books with a facility for collecting fines as appropriate. Keeping account of fines is the work of the accounting subsystem; contacting borrowers whose borrowings are long overdue, as well as registering and deregistering borrowers, is the work of the administration subsystem. Book purchases and sales come under the stock management and accounting subsystems. Books are renewed by the enquiry desk, and the duration of lending, the rate of fine and the number of renewals allowed are set by the administration subsystem, and so on. To view these subsystems in isolation is a bad idea, but we are going to do this (and this unfortunately is the case with quite a few public libraries) to simplify our example, focusing on the lending facility. This will also allow me to make an illustrative point regarding the definition of attributes when we come to it in Data-based Estimation. We’ll likewise assume that a borrower can borrow between 0 and n books, where this n is set by the administration subsystem but interrogated by the lending facility. This means the lending facility needs to access, but not modify, the rate of fine, the maximum number of borrowables and the maximum duration of borrowing.

If we do not have considerable deliberate denormalization (except in the case of record of Borrowings), the important entities, expressed in UML-like notation, may look as shown in Fig.1.


Fig. 1. : A view of important data elements


The following things therefore happen at the lending facility:

1. Books are issued free of charge, provided the borrower stays within the borrowing limit.

2. Books are accepted and, if appropriate, a fine is collected manually in cash and a receipt issued; alternatively, a fine ticket is issued. (Its further processing is managed by administration and accounting, and is therefore outside the scope of the lending facility.)

Important activities and some states of the borrowing and return procedure are shown in Fig. 2 and Fig. 3 respectively.


Fig. 2. : Lending facility activities involved in book borrowing procedure



Fig. 3. : Lending facility activities involved in book returning procedure

Function-based Estimation

As mentioned in the last article, the Function-based approach associates required functionality with so-called transactions, here meaning data operations. Let’s attempt to find these now, by first identifying the sequence of events. I’ll put actions that are outside the scope of the lending facility in blue italics.

Issuing Books

Recognize borrower id


Identify user


Interrogate Borrower data on Borrower ID provided by borrower card recognition facility


Give borrower recognition message


Interrogate Current Borrowing data to find borrowing count for the Borrower ID


Provide message to indicate borrower’s available borrowings


Identify Item


Interrogate Borrowable Item data for Item ID provided by Item recognition facility


Register a borrowing to the borrower


Create a Current Borrowing with information about Borrower and Item already identified


Stamp the borrowing for due date


Interrogate Current Borrowing data to find borrowing count for the Borrower ID


Provide message to indicate borrower’s available borrowings
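The issuing sequence above can be sketched in a few lines of Python. This is only an illustrative, in-memory sketch: the class names, attribute names and the use of plain sets for Borrower and Borrowable Item data are assumptions made for the example, not part of the article's model.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Policy:
    # Read-only for the lending facility; set by the administration subsystem.
    max_current_borrowings: int
    max_borrowing_days: int


@dataclass
class CurrentBorrowing:
    borrower_id: str
    item_id: str
    due_date: date


@dataclass
class LendingDesk:
    # Hypothetical stand-in for the lending facility's data access.
    policy: Policy
    borrowers: set[str]
    items: set[str]
    current: list[CurrentBorrowing] = field(default_factory=list)

    def available_borrowings(self, borrower_id: str) -> int:
        # Interrogate Current Borrowing data to find the count for the borrower.
        count = sum(1 for b in self.current if b.borrower_id == borrower_id)
        return self.policy.max_current_borrowings - count

    def issue(self, borrower_id: str, item_id: str, today: date) -> CurrentBorrowing:
        # Identify user, then identify item.
        if borrower_id not in self.borrowers:
            raise LookupError(f"unknown borrower {borrower_id}")
        if item_id not in self.items:
            raise LookupError(f"unknown item {item_id}")
        # Books are issued only within the borrowing limit.
        if self.available_borrowings(borrower_id) <= 0:
            raise PermissionError("borrowing limit reached")
        # Register the borrowing and work out its due date.
        due = today + timedelta(days=self.policy.max_borrowing_days)
        borrowing = CurrentBorrowing(borrower_id, item_id, due)
        self.current.append(borrowing)
        return borrowing
```

Calling `available_borrowings` again after `issue` corresponds to the final two steps of the sequence, where the borrower's remaining allowance is displayed.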

Returning Books

Identify Borrowing


Interrogate Current Borrowing data for Item ID provided by Item recognition facility


Enter Returned Date for the Borrowing


Fine Check


Find if the item is overdue, if so notify


Consider whether to charge fine for an overdue item


As per input, set ‘Is Charged Fine’ flag for the Borrowing


Add the Borrowing to Past Borrowing and remove the Borrowing from Current Borrowing


Fine Collection Procedure


As necessary, create Fine(s) and Total Fine


From input given by the operator, identify whether the Borrower has paid cash fine on the spot.


Accordingly issue Total Fine receipt or ticket
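The return sequence can be sketched similarly. The article names a Rate of Fine but does not say how a fine is computed, so the sketch below assumes a flat rate per overdue day in whole currency units; that formula, and all names used, are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Borrowing:
    borrower_id: str
    item_id: str
    due_date: date
    returned_date: Optional[date] = None
    is_charged_fine: bool = False


def days_overdue(borrowing: Borrowing, returned: date) -> int:
    # Zero when the item comes back on or before its due date.
    return max(0, (returned - borrowing.due_date).days)


def accept_return(
    current: dict[str, Borrowing],
    past: list[Borrowing],
    item_id: str,
    returned: date,
    charge_fine: bool,
    rate_per_day: int,
) -> int:
    """Process one returned item and return the fine due (0 if none)."""
    borrowing = current[item_id]        # Identify Borrowing from Item ID
    borrowing.returned_date = returned  # Enter Returned Date
    overdue = days_overdue(borrowing, returned)
    # The operator decides whether an overdue item is actually fined.
    borrowing.is_charged_fine = charge_fine and overdue > 0
    # Add to Past Borrowing and remove from Current Borrowing.
    past.append(borrowing)
    del current[item_id]
    # Assumed fine formula: flat rate per overdue day.
    return overdue * rate_per_day if borrowing.is_charged_fine else 0
```

The fines returned for the items processed in one session would then be summed into a Total Fine before the receipt or ticket is issued.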

The next step is to check that transactions are not omitted or duplicated. A rule of thumb for checking omission is to ensure that the facility has all the data it needs and can give all the data that it is required to. This is, among other things, also a function of the analyst’s experience. We check duplication by revisiting each transaction in the context of an entity, with respect to other transactions associated with that entity.

Let’s apply this to our case.

We have 7 entities under consideration; two of these are sister entities derived from the same base, Borrowing (itself an abstract 8th entity). Let’s examine what actions our facility is supposed to take on each of these and whether there are two transactions taking the same action.

Borrower: The facility needs to identify the borrower using its ID. If an item is overdue while being returned, the borrower should be identified from the borrowing and, if commanded, be associated
with the fine procedure. These two borrower interrogations have commonality, whether the borrower ID comes from the Borrowing or from the ID card scanning device.

Borrowing: The lending facility is in complete control of this. It creates Borrowing as Current Borrowing, and interrogates Borrowing to find the Borrower ID while issuing (to check against the borrowing limit) and while accepting (return with delay). It edits Borrowing to reflect the return date and whether a fine is charged, deletes Current Borrowings and creates Past Borrowings.

Borrowable Item: This is interrogated for the purpose of borrowing and returning. It is fair to believe that each time the item is identified, its description is displayed to the operator by interrogating the Book data. On the other hand, there is no independent interrogation of Book data, and all enquiries on Books for their title, author, etc. are dealt with at the enquiry helpdesk, outside the scope of our facility.

Fine: An overdue borrowing identified during return, when authorized to charge a fine, starts a session under a Total Fine ID, which ends when either the Fine receipt or the ticket is printed out. Thus such a session creates a Total Fine with one or more Fines as part of it. A person may be returning Borrowings of multiple Borrowers, and therefore several of these sessions may run in parallel, until the operator terminates them by commanding to print a receipt or ticket for each.

Other entities: The system also interrogates simple data that corresponds to library policy for:

  1. Maximum allowable Current Borrowings
  2. Maximum duration of borrowing
  3. Rate of Fine

The lending facility in our example does not need to interrogate the maximum number of renewals allowed, for if renewals are allowed, they are the responsibility of the enquiry desk.

We therefore seem to have the following transactions:

  1. Identify Borrower from borrower ID
  2. Identify Borrowable Item and associated Book from Item ID
  3. Identify Current Borrowings for a Borrower and display the number of Borrowings that may be borrowed
  4. Identify Borrowing details for a Current Borrowing
  5. Create Current Borrowing
  6. Edit Current Borrowing with returned date and whether fine is charged
  7. Delete Current Borrowing and create Past borrowing
  8. Notify of a returned borrowing attracting fine
  9. Create Fine, & create or edit Total Fine
  10. Interrogate library policy data
  11. Identify Borrower from Item ID
  12. Print Fine receipt or Ticket

Further observation reveals that the second operation, although touching two entities, is essentially a unit activity. The seventh transaction likewise hints at a concurrent create and delete operation; in some respects, the effect is similar to a ‘transfer’ or update operation. The ninth operation involves creating and / or updating two related entities; however, Total Fine is also updated independently, based on whether or not the Borrower pays there and then. We may therefore operationally restructure and redefine the transactions as:

  1. Identify Borrower from borrower ID
  2. Identify data for the Borrowable
  3. Identify current Borrowings for a Borrower and display the number of Borrowings that may be borrowed
  4. Identify Borrowing details for a Current Borrowing
  5. Create Current Borrowing
  6. Edit Current Borrowing
  7. Update Borrowing information
  8. Notify of a returned borrowing attracting fine
  9. Create Fine and create / update related information
  10. Set Total Fine as due or paid
  11. Interrogate library policy data
  12. Identify Borrower from Item ID
  13. Print Fine receipt or Ticket
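The duplication check described earlier (revisiting each transaction in the context of its entity) can be partly mechanized. The sketch below tags each of the thirteen transactions with a primary entity and an action, then groups them to surface pairs that act on the same entity in the same way. The entity and action tags are my reading of the list, not something the article states, and the output is only a list of candidates for the analyst to review.

```python
from collections import defaultdict

# (number, transaction, primary entity, action); the tags are assumptions.
TRANSACTIONS = [
    (1, "Identify Borrower from borrower ID", "Borrower", "read"),
    (2, "Identify data for the Borrowable", "Borrowable Item", "read"),
    (3, "Identify current Borrowings for a Borrower", "Current Borrowing", "read"),
    (4, "Identify Borrowing details for a Current Borrowing", "Current Borrowing", "read"),
    (5, "Create Current Borrowing", "Current Borrowing", "create"),
    (6, "Edit Current Borrowing", "Current Borrowing", "update"),
    (7, "Update Borrowing information", "Borrowing", "transfer"),
    (8, "Notify of a returned borrowing attracting fine", "Borrowing", "notify"),
    (9, "Create Fine and create / update related information", "Fine", "create"),
    (10, "Set Total Fine as due or paid", "Total Fine", "update"),
    (11, "Interrogate library policy data", "Library Policy", "read"),
    (12, "Identify Borrower from Item ID", "Borrower", "read"),
    (13, "Print Fine receipt or Ticket", "Total Fine", "print"),
]


def duplicate_candidates(transactions):
    """Group transaction numbers by (entity, action); keep groups of two or more."""
    groups = defaultdict(list)
    for number, _name, entity, action in transactions:
        groups[(entity, action)].append(number)
    return {key: numbers for key, numbers in groups.items() if len(numbers) > 1}


for (entity, action), numbers in duplicate_candidates(TRANSACTIONS).items():
    print(f"{entity} / {action}: review transactions {numbers}")
```

With these tags, the check points at transactions 1 and 12 (the two Borrower interrogations noted above as having commonality) and at transactions 3 and 4. Whether such pairs are genuine duplicates remains a judgment call, since they differ in the key used or the data returned.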

It is beyond this point that actual numbers come into play, most derived from empirical data analyzed from industrial sources. Let me therefore refer to Charles Symons’s work[iii] from here on for this purpose. There is a ‘better method’ here, which looks at the details transaction by transaction. Let’s first examine this.

We first attempt to find the likely number of inputs, entities and outputs associated with each transaction. With the increasing use of data-bound views and container-based unified validations, however, there is a case for deliberating on whether the definition of inputs and outputs needs to be reconsidered. For example, an input validator control or data-bound display control may exist for dealing with contact information, which is needed in many places; in that case, the developers of the system may simply drop this control into their development environment without needing to consider the input and output fields within it separately. Indeed, the database may store the entire data as a single hierarchical structure, say as XML, or even as something like a BLOB, which, among other things, can store an aerial photograph of the location. For the purpose of our Mickey Mouse example, however, let’s assume the simplest and crudest development environment. Let’s therefore believe, for illustration purposes, that we have arrived at the following table.


The Unadjusted Function Point count (UFP) is then given as:

0.58 x inputs + 1.66 x entities + 0.26 x outputs (note that as this is the function-based approach, we are taking the cumulative count through all functions, and not referring to, say, the number of entities in the model. That will come in the data-based approach, with some very interesting results!)

In our illustration, this is,

0.58 x 23 + 1.66 x 33 + 0.26 x 41

= 78.78 ~ 79
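The arithmetic can be checked with a short Python sketch. The weights and the cumulative totals (23 inputs, 33 entity references, 41 outputs) are the ones quoted above; the function name is simply a label chosen for this example.

```python
def unadjusted_function_points(inputs: int, entities: int, outputs: int) -> float:
    """UFP from cumulative counts taken across all transactions."""
    return 0.58 * inputs + 1.66 * entities + 0.26 * outputs


ufp = unadjusted_function_points(inputs=23, entities=33, outputs=41)
print(round(ufp, 2))  # 78.78
print(round(ufp))     # 79
```

The three terms contribute 13.34, 54.78 and 10.66 respectively, so the entity references dominate the count in this example.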

Shortcut method for Function-based estimation

Now a brief look at the shortcut method. Here each transaction is branded CRUDly (as one that creates, reads, updates or deletes). It is further pigeonholed based on its perceived complexity. Next, the total number of transactions of each type (e.g. Complex Create, Simple Delete, etc.) is worked out. Finally, a Weighting Multiplier is applied to each type and all the products are summed up, giving an approximate UFP result. The tables of guideline bound values for establishing the complexity of a transaction, and of the multipliers, may be found in Charles Symons’s source acknowledged earlier.

This approach is essentially an approximation, as it depends on bounds rather than exact values, and attempts to classify transactions precisely, based on their CRUD class. It is naturally not applied when each transaction can be dealt with individually, as illustrated in the earlier section.
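The mechanics of the shortcut method can be sketched as a count-and-weight calculation. Note that the multipliers below are placeholders invented purely so that the example runs; they are not Symons's published values, which should be taken from his tables. The classification of transactions shown is likewise hypothetical.

```python
from collections import Counter


def shortcut_ufp(classified, weights):
    """Approximate UFP: count transactions per (complexity, CRUD) type,
    multiply each count by its weighting multiplier, and sum the products."""
    counts = Counter(classified)
    return sum(weights[kind] * count for kind, count in counts.items())


# PLACEHOLDER multipliers for illustration only; NOT the published values.
PLACEHOLDER_WEIGHTS = {
    ("simple", "read"): 1.0,
    ("simple", "create"): 2.0,
}

# Hypothetical classification of four transactions.
classified = [
    ("simple", "read"),
    ("simple", "read"),
    ("simple", "read"),
    ("simple", "create"),
]

print(shortcut_ufp(classified, PLACEHOLDER_WEIGHTS))  # 5.0
```

A transaction type missing from the weights table raises a `KeyError`, which is deliberate here: it forces every type in use to be given an explicit multiplier.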

Converting Function Points to Estimates

Very briefly, the Unadjusted Function Point count yields the Function Point Index (FPI) when multiplied by the so-called Technical Complexity Adjustment (TCA). The TCA itself depends on the ‘degrees of influence’. Symons’s technique is based on 19 such elements, each having a range of influence scores to choose from. Determining the degrees of influence, and therefore the TCA, is somewhat subjective; a consensus approach is therefore usually considered most desirable. The TCA can take a value as low as 0.655, thereby bringing the FPI down compared to the UFP, for the simplest systems that offer virtually no technical challenge. The FPI scales linearly as ‘work’ and serves as input to other management equations that require this quantity to decide optimal project mass, delivery schedule and budget.
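As a small worked example of this last step, the sketch below multiplies the rounded UFP from our illustration (79) by the lowest TCA value mentioned above (0.655). Pairing these two numbers is an assumption made only to show the arithmetic; the library example has not actually been scored for its degrees of influence.

```python
def function_point_index(ufp: float, tca: float) -> float:
    """FPI is the Unadjusted Function Point count scaled by the TCA."""
    return ufp * tca


fpi = function_point_index(ufp=79, tca=0.655)
print(round(fpi, 3))  # 51.745
```

A TCA below 1 pulls the FPI under the UFP, as here; a technically demanding system would be scored with a higher TCA and a correspondingly larger FPI.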


In this article, I have illustrated lines of processing associated with the Function-based Estimation Approach. The recourse to isolating a subsystem was to keep it short and simple. I hope however
that the reader has got the gist of what it is like to use the FB approach, which was the purpose of this tiny exercise.

What’s next

In future articles, I plan to discuss the Data-based estimation approach and its quicker but even more approximate version. In the present example, we have isolated a subsystem, where many entities are not controlled by the subsystem itself. If, therefore, we were to apply the data-based approach, as is, to the entities we have found here, we would come up with a ridiculously high value of UFP, because in using the data-based approach with all these entities we would be assuming total control of and responsibility for all of them. I’ll therefore need to trim the entity diagram to zero in on only those entities that the subsystem has complete control over, before applying the data-based approach.


[i] Amit Bhagwat, Data Modeling & Enterprise Project Management, Part 1: Estimation, TDAN (Issue 26)

[ii] Amit Bhagwat, Estimating use-case driven iterative development for “fixed-cost” projects, The Rational Edge (October 2003)

[iii] Charles Symons, Software Sizing and Estimation: MKII Function Point Analysis, John Wiley & Sons.



About Amit Bhagwat

Amit Bhagwat is an information architect and visual modeling enthusiast, in the thick of object oriented modeling and its application to information systems. He has developed a specialised interest in applying techniques from pure sciences to data modeling to get the best out of MIS / BIS. He is an active member of the precise UML group and happens to be the brain father of the Projection Analysis and Event Progress Analysis techniques. He also maintains contents for the celebrated Cetus Links in Architecture and Design areas. He shares a variety of other interests including photography, poetry and sociological studies. Explore some of his work at: