Data Warehousing, Data Mining & OLAP

Authors: Alex Berson and Stephen J. Smith
Publisher: McGRAW-HILL (ISBN 0-07-006272-2)

Data Warehousing, Data Mining, & OLAP, written by Alex Berson and Stephen J. Smith (Computing McGraw-Hill 1997), focuses on data delivery as a top priority in business computing today. The
authors use the foreword to specify the three areas of data warehousing covered in the book: 1) bringing the data necessary for enhancing traditional information presentation technologies into a
single source, 2) supporting online analytical processing (OLAP), and 3) the newest data delivery engine, Data Mining.

The book is broken into five parts: Foundation, Data Warehousing, Business Analysis, Data Mining, and Data Visualization and Overall Perspective. Each part goes into a tremendous amount of detail,
starting general and moving to the specific, across at least five long chapters in each section.

The Foundation section begins by introducing the data warehouse, presenting an overview of client/server architectures, and describing parallel processors and cluster systems. The section continues
by discussing distributed database management systems and by offering individual overviews of major client/server RDBMS environments such as Oracle, Informix, Sybase, IBM’s DB2,
and Microsoft MS-SQL Server. This section builds a tremendous foundation of warehousing technology by detailing hardware architectures, multiprocessing architectures, and RDBMS features and
solutions.

The second section, Data Warehousing, begins by detailing data warehousing components and the process of building a data warehouse. This section of the book covers mapping the warehouse to
parallel processing architectures, selecting database schemas for decision support, and extracting, cleaning, and transforming data, and it describes meta data as a key component in
supporting knowledge workers. The chapters go into tremendous detail, discussing tool requirements and offering a tool-by-tool look at vendor-based solutions.

The Business Analysis section of this book begins by breaking reporting and query tools into categories including reporting tools, managed query tools, executive information system (EIS) tools,
OLAP tools, and data mining tools. The authors talk about the need for developing reporting applications and then discuss many of the most recognized reporting and querying tools on the market
today. The chapters in this section also detail OLAP (what it is and why it is necessary), introduce patterns and models for business analysis, explain different types of statistical
analysis, and delve briefly into the technologies of expert systems and artificial intelligence.

The fourth section, Data Mining, introduces the topic by discussing its motivation, measuring its effectiveness, and by defining the difference between discovery and prediction. The first chapter
in this section talks about the state of the data mining industry and compares present technologies to those of the recent past. The rest of the chapters in this section discuss decision
trees, neural networks, genetic algorithms and rule induction. The section wraps up by helping the reader to select and use the right tools.

The final section, Data Visualization and Overall Perspective, pulls together the information from the previous sections. In this section, the authors assume a basic understanding of what was
delivered in the other sections. This section focuses on “putting it all together” by discussing scalable solutions, the data warehouse market, costs and benefits of data warehousing, and by
describing Berson and Smith’s impressions of what is to come (and may already be here) in the field of data delivery. These impressions cover distributed warehouses, internet/intranet for
information delivery, object-relational databases, and very large databases (VLDBs).

The appendixes of the book provide additional information beyond that already detailed in the sections and chapters described above. The appendixes include a detailed glossary of business and
technical terms used and discussed in the chapters, a section on improving return on investment (ROI), Dr. E.F. Codd’s twelve guidelines for OLAP, and the Data Warehousing Institute’s
ten mistakes for data warehousing managers to avoid.

With this book, Data Warehousing, Data Mining, & OLAP, Alex Berson and Stephen J. Smith have delivered an important reference for all individuals developing data warehouses right now. The book
provides a level of detail that is hard to find in one place anywhere. Because the authors introduce, define, and detail all aspects of data delivery, and provide in-depth information about
tools presently on the market, this book will be a tremendous tool and reference guide for any individual responsible for delivering data to the corporation.

Robert S. Seiner

Robert (Bob) S. Seiner is the President and Principal of KIK Consulting & Educational Services and the Publisher Emeritus of The Data Administration Newsletter. Seiner is a thought leader in the fields of data governance and metadata management. KIK (which stands for “knowledge is king”) offers consulting, mentoring and educational services focused on Non-Invasive Data Governance, data stewardship, data management and metadata management solutions. Seiner is the author of the industry’s top selling book on data governance, Non-Invasive Data Governance: The Path of Least Resistance and Greatest Success (Technics Publications 2014), and of the follow-up book, Non-Invasive Data Governance Strikes Again: Gaining Experience and Perspective (Technics 2023). He has hosted the popular monthly webinar series on data governance called Real-World Data Governance (with Dataversity) since 2012. Seiner holds the position of Adjunct Faculty and Instructor for the Carnegie Mellon University Heinz College Chief Data Officer Executive Education program.
