Data mesh is the latest trend to grip the data and analytics sector. The term has been rapidly adopted by numerous vendors, as well as a growing number of organizations, as a means of embracing distributed data processing. Understanding and adopting data mesh remains a challenge, however. Data mesh is not a product that can be acquired, or even a technical architecture that can be built. It is an organizational and cultural approach to data ownership, access and governance, and adopting it requires corresponding organizational and cultural change. Data mesh promises multiple benefits to organizations that embrace this change, but doing so may be far from easy.
The term data mesh was coined in 2019 by Zhamak Dehghani, a principal technology consultant at Thoughtworks. It was proposed as a way to shift away from centralized approaches to analytics built around monolithic analytic data platforms (such as data warehouses or data lakes) and to adopt distributed ownership of data and metadata. Data mesh can be thought of as doing for analytical data what microservices did for functionality: distributing ownership and responsibility to domain experts, who make the data available for consumption by others across the organization.
The data mesh concept is based on four key principles: domain-oriented ownership, data as a product, self-serve data infrastructure and federated governance. Domain-oriented ownership gives business departments or units responsibility for managing the data generated by their applications, including preparing and enriching it for analysis. The principle of data as a product means that those business domains are also responsible for making data available to users in other domains. Data sharing is enabled by self-serve data infrastructure, which allows domains across an organization to share, discover and access data products. Distributed data ownership and sharing require adherence to agreed standards that ensure usability and enforce data quality. This adherence is supported by a federated approach to governance that involves individual domains as well as regulatory and security subject matter experts and infrastructure platform specialists.
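To make the relationship between the four principles more concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not an implementation of data mesh, which remains an organizational approach rather than something that can be built in code. All names here (`DataProduct`, `Catalog`, `requires_owner`, `requires_schema`) are hypothetical and chosen for this example only. A domain owns and publishes a data product, a shared catalog stands in for self-serve infrastructure, and a list of agreed policy checks stands in for federated governance.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product owned by one domain."""
    name: str
    domain: str          # domain-oriented ownership: the business unit responsible
    owner: str           # the accountable team or person within that domain
    description: str
    schema: Dict[str, str] = field(default_factory=dict)


# A policy returns None when the product complies, or a message when it does not.
Policy = Callable[[DataProduct], Optional[str]]


def requires_owner(product: DataProduct) -> Optional[str]:
    return None if product.owner.strip() else "data product must name an owner"


def requires_schema(product: DataProduct) -> Optional[str]:
    return None if product.schema else "data product must publish a schema"


class Catalog:
    """Stand-in for self-serve infrastructure: publish and discover data products."""

    def __init__(self, policies: List[Policy]) -> None:
        # Federated governance: the policies are agreed across domains,
        # but each domain publishes its own products.
        self._policies = list(policies)
        self._products: Dict[Tuple[str, str], DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        violations = []
        for policy in self._policies:
            message = policy(product)
            if message is not None:
                violations.append(message)
        if violations:
            raise ValueError("; ".join(violations))
        self._products[(product.domain, product.name)] = product

    def discover(self, keyword: str) -> List[DataProduct]:
        needle = keyword.lower()
        return [
            p for p in self._products.values()
            if needle in p.name.lower() or needle in p.description.lower()
        ]


catalog = Catalog([requires_owner, requires_schema])
catalog.publish(DataProduct(
    name="orders_daily",
    domain="sales",
    owner="sales-data-team",
    description="Daily order totals",
    schema={"order_date": "date", "total": "decimal"},
))
print([f"{p.domain}.{p.name}" for p in catalog.discover("order")])
```

In this sketch a product that lacks an owner or a schema is rejected at publication time, which mirrors the idea that agreed standards are enforced centrally while ownership stays with the domain. In practice the organizational agreement on those standards is the hard part, and the code is only a placeholder for it.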
In addition to making organizational and cultural changes, many organizations also need to make technological changes to facilitate the adoption of data mesh. While the approach can be seen as diametrically opposed to centralized analytic platform projects such as data warehouses and data lakes, investment in these data platforms does not need to be abandoned to embrace data mesh. Data warehouse or data lake platforms can still serve as data repositories used by individual domains as part of a data mesh. Meanwhile, there is some conflation of data mesh with data fabric, which is a technology-driven approach to managing data across distributed environments. The relationship between data mesh and data fabric is the focus of a separate Analyst Perspective, but technological investment and evolution may certainly be required to facilitate the management and sharing of domain-oriented data products.
Data mesh cannot be delivered through adoption of a single product or data platform. Consequently, one of the benefits of data mesh is a shift away from defining data strategy in terms of data platforms. Your data warehouse, data lake or data lakehouse is not your data strategy; it is only one part of an organizational and cultural approach to data. Organizations should beware of vendors trying to sell a data mesh platform. That said, I recommend that all organizations consider the potential advantages of data mesh, while also being cognizant of the significant organizational and cultural changes that its adoption will entail.
Regards,
Matt Aslett