The Book Look: Designing Data Products: The Data Products Series Volume I

Data products are a cornerstone of modern data architecture. Author Mario Meir-Huber is writing a trilogy on data products: Volume 1 covers designing data products, Volume 2 building them, and Volume 3 scaling them. Volume 1 — “Designing Data Products: The Data Products Series Volume I” — has recently been released, and it is the subject of this book review.

This volume explains what “data products” are, why they matter, and how to build them without getting buried in jargon. The book assumes no prior knowledge of data products and is written for product owners, analysts, data engineers, architects, and project managers. The book’s mantra is to treat data like a product that must deliver value, stay usable, and improve over time. 

Chapter 1, the Foundation, frames data products as an answer to the pendulum swing between strict centralization and full decentralization, where problems usually get shifted rather than solved. Data products stand on their own, borrowing proven “old world” practices while adopting newer ideas from data mesh and data fabric, with an emphasis on gradual improvement instead of reinvention. A data product is defined as something that “delivers value to its consumers,” and the chapter emphasizes quality and reliability as non-negotiable requirements for value. It also differentiates data products from data assets and data projects. 

Chapter 2, on GAP (not the clothing store), covers the powerful triad of Governance, Architecture, and People. Governance builds trust and compliance; architecture provides the backbone for robustness and scalability; and people and culture determine adoption and continuous improvement. The chapter is pragmatic about how organizations react to the word governance, suggesting you focus on “high-quality data products” rather than trying to win a political argument about governance programs. GAP is not a side initiative; it must be embedded into how products get built and operated.

Chapter 3 introduces DRIVE, which lays out the lifecycle of a data product, from retrieval to integration to value extraction, with “output ports” as the concrete artifacts users come into contact with. Value extraction is the point where data stops being “a pipeline” and becomes visible, usable, and impactful through dashboards, APIs, machine learning, or other interfaces. The chapter also stresses that usability requires documentation and training, not just access. It then introduces CIA, Continuous Improvement and Adaptation, as the mindset that spans the lifecycle: Data products are never “done,” and should evolve based on feedback, shifting business needs, and changing technology.

Chapter 4 is about practice, turning the product conversation into a decision framework. How do we calculate and communicate value? How do we prioritize what to build? It distinguishes between financial and strategic impact and warns that high impact alone does not mean “build it next” if feasibility is low. Feasibility is grounded in GAP thinking, so technical readiness and organizational capability matter as much as the business case.  

Chapter 5, on scalability, focuses on organizational design: central, decentralized, or something in between. The book argues that neither extreme is a universal answer and proposes a hub-and-spoke model that combines centralized standards and shared platforms with decentralized domain ownership and prioritization. This chapter covers roles, highlighting the data product owner as the business-embedded decision-maker, along with execution roles such as squad lead, proxy product owner, and tech lead, and chapter leads for people development. The chapter closes with an agile team structure inspired by the Spotify model, separating staff management from delivery so squads can move faster without overloading managers.

Chapter 6 focuses on retrieval: how data enters the product, whether streaming or batch. The chapter emphasizes concepts such as data contracts (to ensure producers and consumers agree on shape and expectations), change data capture, and the need for reliable orchestration and repeatable pipelines. The broader message is that “getting the data in” is not a trivial first step, as it sets the quality ceiling for everything downstream.
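To make the data-contract idea concrete, here is a minimal sketch of a contract check at ingestion time. The contract format, field names (`order_id`, `amount`, `currency`), and validation logic are my own illustration, not taken from the book, which covers contracts at a conceptual level.

```python
# A toy data contract: each incoming record must carry these fields
# with these Python types. Real contracts (e.g. expressed in a schema
# registry) cover much more, such as semantics, SLAs, and ownership.
CONTRACT = {
    "order_id": str,
    "amount": float,
    "currency": str,
}

def contract_violations(record: dict) -> list[str]:
    """Return a list of violations for one record (empty list = record passes)."""
    problems = []
    for field, expected_type in CONTRACT.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: {type(record[field]).__name__}"
            )
    return problems

good = {"order_id": "A-1", "amount": 9.99, "currency": "EUR"}
bad = {"order_id": "A-2", "amount": "9.99"}  # amount is a string, currency missing

print(contract_violations(good))  # []
print(contract_violations(bad))
```

The point the chapter makes holds even in this tiny sketch: a record rejected at the ingestion boundary never gets the chance to corrupt downstream layers.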

Chapter 7 covers integration, which is the process of transforming raw inputs into reliable, usable models. The chapter introduces the medallion architecture (bronze, silver, gold) to make data maturity explicit: Bronze is raw and close to the source, silver is cleaned and validated against standards, and gold is refined for business consumption and becomes the visible layer of the product. From there, it reviews modeling trade-offs, including normalization vs. denormalization, star and snowflake schemas, and higher-level strategies like dimensional modeling and data vault, with guidance on where these fit across layers. The chapter’s quality theme is consistency: Avoid competing ingestion routes and multiple models of the same entity, and use approaches like master data management where appropriate.
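As a rough illustration of the bronze/silver/gold progression, the sketch below walks a few records through the three layers. The column names, the cleanup rules, and the revenue-per-country rollup are invented for this example; the book describes the layering, not this specific code.

```python
# Bronze: raw records exactly as landed from the source system,
# including messy casing, stringly-typed numbers, and an invalid row.
bronze = [
    {"country": " de ", "revenue": "100.0"},
    {"country": "DE", "revenue": "50.0"},
    {"country": "AT", "revenue": None},  # invalid: dropped in silver
]

def to_silver(rows):
    """Clean and validate: normalize country codes, cast revenue, drop bad rows."""
    out = []
    for r in rows:
        if r["revenue"] is None:
            continue
        out.append({
            "country": r["country"].strip().upper(),
            "revenue": float(r["revenue"]),
        })
    return out

def to_gold(rows):
    """Refine for business consumption: total revenue per country."""
    totals: dict[str, float] = {}
    for r in rows:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["revenue"]
    return totals

gold = to_gold(to_silver(bronze))
print(gold)  # {'DE': 150.0}
```

Note how the two bronze spellings of “DE” collapse into one gold entity — exactly the consistency concern the chapter raises about multiple models of the same entity.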

Chapter 8 on extraction is the final chapter. This chapter treats rollout and adoption as part of the product, not an afterthought. It describes practical change management tactics like feedback channels, regular updates, and building communities of “change ambassadors” who help adoption spread through informal trust. It also tackles democratizing access, pointing to the enterprise data catalog as the “front door” for discoverability and requestability, and stressing that policy and lawful execution come before tooling. The chapter then surveys access strategies: BI for broad business use, APIs for integration into systems and workflows, AI/analytics use cases built on large volumes of granular data, and data spaces for cross-company sharing where sovereignty and trust matter. 

Volume 1 of the trilogy is a compact roadmap for treating data work as product work: Define value, build end-to-end, embed governance, architecture, and people into the process, and keep improving after launch. If you want a clear mental model you can reuse in meetings, planning sessions, and technical design reviews, this is worth reading straight through, then revisiting chapter-by-chapter as you start building (or rescuing) real data products in your organization. 

I am looking forward to reading Volume 2 on building data products when it comes out later in 2026! 

Steve Hoberman

Steve Hoberman has trained more than 10,000 people in data modeling since 1992. Steve is known for his entertaining and interactive teaching style (watch out for flying candy!), and organizations around the globe have brought Steve in to teach his Data Modeling Master Class, which is recognized as the most comprehensive data modeling course in the industry. Steve is the author of nine books on data modeling, including the bestseller Data Modeling Made Simple. Steve is also the author of the bestseller, Blockchainopoly. One of Steve’s frequent data modeling consulting assignments is to review data models using his Data Model Scorecard® technique. He is the founder of the Design Challenges group, Conference Chair of the Data Modeling Zone conferences, director of Technics Publications, and recipient of the Data Administration Management Association (DAMA) International Professional Achievement Award. He can be reached at me@stevehoberman.com.
