/dev/reading
Data Architectures

Building an Event-Driven Data Mesh
Patterns for Designing & Building Event-Driven Architectures
by Adam Bellemare

The exponential growth of data combined with the need to derive real-time business value is a critical issue today. An event-driven data mesh can power real-time operational and analytical workloads, all from a single set of data product streams. With practical real-world examples, this book shows you how to successfully design and build an event-driven data mesh.

Building an Event-Driven Data Mesh provides:

  • Practical tips for iteratively building your own event-driven data mesh, including hurdles you'll experience, possible solutions, and how to obtain real value as soon as possible
  • Solutions to pitfalls you may encounter when moving your organization from monoliths to event-driven architectures
  • A clear understanding of how events relate to systems and other events in the same stream and across streams
  • A realistic look at event modeling options, such as fact, delta, and command type events, including how these choices will impact your data products
  • Best practices for handling events at scale, privacy, and regulatory compliance
  • Advice on asynchronous communication and handling eventual consistency

Data Management at Scale
Modern Data Architecture with Data Mesh and Data Fabric
by Piethein Strengholt

As data management continues to evolve rapidly, managing all of your data in a central place, such as a data warehouse, is no longer scalable. Today's world is about quickly turning data into value. This requires a paradigm shift in the way we federate responsibilities, manage data, and make it available to others. With this practical book, you'll learn how to design a next-gen data architecture that takes into account the scale you need for your organization.

Executives, architects and engineers, analytics teams, and compliance and governance staff will learn how to build a next-gen data landscape. Author Piethein Strengholt provides blueprints, principles, observations, best practices, and patterns to get you up to speed.

  • Examine data management trends, including regulatory requirements, privacy concerns, and new developments such as data mesh and data fabric
  • Go deep into building a modern data architecture, including cloud data landing zones, domain-driven design, data product design, and more
  • Explore data governance and data security, master data management, self-service data marketplaces, and the importance of metadata

Data Mesh
Delivering Data-Driven Value at Scale
by Zhamak Dehghani

We're at an inflection point in data, where our data management solutions no longer match the complexity of organizations, the proliferation of data sources, and the scope of our aspirations to get value from data with AI and analytics. In this practical book, author Zhamak Dehghani introduces data mesh, a decentralized sociotechnical paradigm drawn from modern distributed architecture that provides a new approach to sourcing, sharing, accessing, and managing analytical data at scale.

Dehghani guides practitioners, architects, technical leaders, and decision makers on their journey from traditional big data architecture to a distributed and multidimensional approach to analytical data management. Data mesh treats data as a product, considers domains as a primary concern, applies platform thinking to create self-serve data infrastructure, and introduces a federated computational model of data governance.

  • Get a complete introduction to data mesh principles and their constituents
  • Design a data mesh architecture
  • Guide a data mesh strategy and execution
  • Navigate organizational design to a decentralized data ownership model
  • Move beyond traditional data warehouses and lakes to a distributed data mesh

Data Mesh in Action
by Jacek Majchrzak, Sven Balnojan, Marian Siwiak, and Mariusz Sieraczkiewicz

Revolutionize the way your organization approaches data with a data mesh! This new decentralized architecture outpaces monolithic lakes and warehouses and can work for a company of any size.

In Data Mesh in Action you will learn how to:

  • Implement a data mesh in your organization
  • Turn data into a data product
  • Move from your current data architecture to a data mesh
  • Identify data domains, and decompose an organization into smaller, manageable domains
  • Set up the central governance and local governance levels over data
  • Balance responsibilities between the two levels of governance
  • Establish a platform that allows efficient connection of distributed data products and automated governance

Data Mesh in Action reveals how this groundbreaking architecture looks for both startups and large enterprises. You won’t need any new technology—this book shows you how to start implementing a data mesh with flexible processes and organizational change. You’ll explore both an extended case study and real-world examples. As you go, you’ll be expertly guided through discussions around Socio-Technical Architecture and Domain-Driven Design with the goal of building a sleek data-as-a-product system. Plus, dozens of workshop techniques for both in-person and remote meetings help you onboard colleagues and drive a successful transition.

Deciphering Data Architectures
Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh
by James Serra

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs. With this book, you'll:

  • Gain a working understanding of several data architectures
  • Learn the strengths and weaknesses of each approach
  • Distinguish data architecture theory from reality
  • Pick the best architecture for your use case
  • Understand the differences between data warehouses and data lakes
  • Learn common data architecture concepts to help you build better solutions
  • Explore the historical evolution and characteristics of data architectures
  • Learn essentials of running an architecture design session, team organization, and project success factors

Free from product discussions, this book will serve as a timeless resource for years to come.

Delta Lake: Up and Running
Modern Data Lakehouse Architectures with Delta Lake
by Bennie Haelen and Dan Davis

With the surge in big data and AI, organizations can rapidly create data products. However, the effectiveness of their analytics and machine learning models depends on the data's quality. Delta Lake's open source format offers a robust lakehouse framework over platforms like Amazon S3, ADLS, and GCS.

This practical book shows data engineers, data scientists, and data analysts how to get Delta Lake and its features up and running. The ultimate goal of building data pipelines and applications is to gain insights from data. You'll understand how your storage solution choice determines the robustness and performance of the data pipeline, from raw data to insights.

You'll learn how to:

  • Use modern data management and data engineering techniques
  • Understand how ACID transactions bring reliability to data lakes at scale
  • Run streaming and batch jobs against your data lake concurrently
  • Execute update, delete, and merge commands against your data lake
  • Use time travel to roll back and examine previous data versions
  • Build a streaming data quality pipeline following the medallion architecture

Designing Cloud Data Platforms
by Danil Zburivsky and Lynda Partner

Centralized data warehouses, the long-time de facto standard for housing data for analytics, are rapidly giving way to multi-faceted cloud data platforms. Companies that embrace modern cloud data platforms benefit from an integrated view of their business using all of their data and can take advantage of advanced analytic practices to drive predictions and as yet unimagined data services.

Designing Cloud Data Platforms is a hands-on guide to envisioning and designing a modern scalable data platform that takes full advantage of the flexibility of the cloud. As you read, you'll learn the core components of a cloud data platform design, along with the role of key technologies like Spark and Kafka Streams. You'll also explore setting up processes to manage cloud-based data, keep it secure, and analyze it with advanced analytic and BI tools.

Designing Distributed Systems
Patterns and Paradigms for Scalable, Reliable Services
by Brendan Burns

Without established design patterns to guide them, developers have had to build distributed systems from scratch, and most of these systems are one of a kind. Today, the increasing use of containers has paved the way for core distributed system patterns and reusable containerized components. This practical guide presents a collection of repeatable, generic patterns to help make the development of reliable distributed systems far more approachable and efficient.

Author Brendan Burns—Director of Engineering at Microsoft Azure—demonstrates how you can adapt existing software design patterns for designing and building reliable distributed applications. Systems engineers and application developers will learn how these long-established patterns provide a common language and framework for dramatically increasing the quality of your system.

  • Understand how patterns and reusable components enable the rapid development of reliable distributed systems
  • Use the sidecar, adapter, and ambassador patterns to split your application into a group of containers on a single machine
  • Explore loosely coupled multi-node distributed patterns for replication, scaling, and communication between the components
  • Learn distributed system patterns for large-scale batch data processing covering work queues, event-based processing, and coordinated workflows

The Cloud Data Lake
A Guide to Building Robust Cloud Data Architecture
by Rukmani Gopalan

More organizations than ever understand the importance of data lake architectures for deriving value from their data. Building a robust, scalable, and performant data lake remains a complex proposition, however, with a buffet of tools and options that need to work together to provide a seamless end-to-end pipeline from data to insights.

This book provides a concise yet comprehensive overview of the setup, management, and governance of a cloud data lake. Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance.

  • Learn the benefits of a cloud-based big data strategy for your organization
  • Get guidance and best practices for designing performant and scalable data lakes
  • Examine architecture and design choices, and data governance principles and strategies
  • Build a data strategy that scales as your organizational and business needs increase
  • Implement a scalable data lake in the cloud
  • Use cloud-based advanced analytics to gain more value from your data

The Data Warehouse Toolkit
The Definitive Guide to Dimensional Modeling
by Ralph Kimball and Margy Ross

The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.

  • Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
  • Begins with fundamental design recommendations and progresses through increasingly complex scenarios
  • Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
  • Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more

Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.