At first, it might seem strange to separate the process of modeling data from the process of designing databases. Not only are these topics very different, but if the first is not done before the second, then you will have both a bad database design as well as a high probability that the databases are not integrated, redundant, and possess conflicting semantics. These are the inverses of desired characteristics.
Simply stated, a schema and its contained tables represent a completely encapsulated set of data specifications within the “hard boundaries” of its schematic (hence, it’s within a schema). In contrast, a model of data is simply a collection of entities, attributes, and relationships among the entities across one or more subject areas. Data models have no real hard boundaries, only abstract boundaries. Data models are supposed to represent the semantics of natural collections of data specifications. Schemas are specifications of semantics of data specifications contained within a database that, in turn, lead to actual instances of stored data. A data model is a data-type collection specification. A database is a data-instance-collection specification. Database models are then the use and re-use of various data-type collections.
Case Study
A school system needed to build a database application (an application where the database is the core component). The problem domain was to track individuals who were being trained to be Health Room Medical Assistants (HMA). The school system has about 140,000 students, exists within an urban setting, and reports to an urban government that exists within a U.S. State.
There are about 210 health rooms across the school system, and whenever the RN (Registered Nurse) is absent from a school, an HMA has to handle the dispensing of medicine, medical record keeping, and first aid. The HMA can only be authorized to perform health-related activities through a “delegating” RN. The HMA “works under” the delegating RN’s license.
School principals are responsible for selecting HMA candidates. They must pass a reading, writing, and arithmetic test. If the test is failed, the HMA candidate cannot come back to the HMA training class for 90 days. The HMA nominee must “sit” for a 40-hour class. There are three tests during the class, and a practical demonstration-based exam at the end. If any test is failed, the HMA candidate is “expelled” and has to retake the entire course starting with the entrance test.
Once an HMA has passed the tests they are deployed. Thereafter, there are follow-up reviews and direct observations every 45 school days. After the first 45-day period, the HMA candidate can apply to the State to become a certified HMA. If after certification the HMA fails a follow-up review and/or any direct observation, the supervising RN must report that failure to the State Board of Nursing. The Board of Nursing may de-certify the HMA. If decertified, the process begins again starting with the entrance test. The rationale for all this is that giving the wrong drugs or dosage to a student, or failing to properly document medical treatments is a serious matter that might cause harm or even death to a student. The State’s Board of Nursing takes this seriously.
The task was to build the HMA database application against these requirements. A nurse supervisor of the school system’s student health organization chose MS/Access to build the application based on guidance from the school system’s IT department. Since the nurse supervisor was told that MS/Access is really simple, and since the nurse supervisor was an HMA subject matter expert, she just fired up Access on her computer and started to type in the database’s design: One table, of course. As she discovered something she forgot, she just modified the database’s one table design, one column at a time.
After about nine months, the one-table database in MS/Access was complete. She started to type in data for about 300 records of HMAs in various stages of the certification process. Of course “she” herself was required to be present in order to understand and explain each and every row of data and all the implied relationships among the columns. There were multiple status codes, some of which are related to others. There was no history.
During a review of the database’s design, and in light of the requirements stated above, questions were asked about the underlying process, test re-takes, test failures, observations, re-certifications, how HMA moves from one school to the next are recorded, how changes in RN delegation are handled, and the automatic refreshing of names, addresses, and schools from the school system’s databases. None of these questions had been accommodated within the database’s design.
A question was then asked, “Where’s your model of the data?” She stated, “Oh, right here.” She showed the MS/Access database table. She was not aware that a model of the data is different from a database’s design. After this review, the next two hours were then spent figuring out the model of the data. At the end of this first data model design session about 15-20 entities were identified and related. At this point, she knew that these requirements were beyond what she knew how to do with Access. What was starting to emerge was the need for: 1) a data model that would mirror the real requirements, and 2) an application system to manage the user interface, underlying processes, data entry, errors, relationships, editing, workflow, and reporting.
What should she have done? That’s the subject of a short paper. The paper was written as if she had all the necessary tools to accomplish all the steps. Did she have these tools? Would the school system expend the resources to make “simple little applications” like these practically possible?