This post is cowritten by Tommaso Paracciani and Oghosa Omorisiagbon from HEMA.
Data has become an invaluable asset for businesses, offering critical insights to drive strategic decision-making and operational optimization. However, many companies today still struggle to effectively harness and use their data due to challenges such as data silos, lack of discoverability, poor data quality, and a lack of data literacy and analytical capabilities to quickly access and use data across the organization. To address these growing data management challenges, AWS customers are using Amazon DataZone, a data management service that makes it fast and easy to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources.
HEMA has been a household Dutch retail brand name since 1926, providing daily convenience products with unique design. HEMA’s more than 17,000 employees deliver exclusive, sustainably designed products in more than 750 stores in the Netherlands, as well as in Belgium, Luxembourg, France, Germany, and Austria, with webstores available in all these countries. HEMA built its first ecommerce system on AWS in 2018, and 5 years later its developers have the freedom to innovate and build software fast with their choice of tools in the AWS Cloud. Today, this is powering every part of the organization, from the customer-favorite online cake customization feature to democratizing data to drive business insight.
This post describes how HEMA used Amazon DataZone to build their data mesh and enable streamlined data access across multiple business areas. It explains HEMA’s unique journey of deploying Amazon DataZone, the key challenges they overcame, and the transformative benefits they’ve realized since deployment in May 2024. From establishing an enterprise-wide data inventory and improving data discoverability, to enabling decentralized data sharing and governance, Amazon DataZone has been a game changer for HEMA.
Data landscape at HEMA
After HEMA moved its entire data platform from on premises to the AWS Cloud, the wave of change presented a unique opportunity for the HEMA Data & Cloud function to invest in and commit to building a data mesh.
HEMA has a bespoke enterprise architecture, built around the concept of services. These services are individual software functionalities that fulfill a specific purpose within the company. Each service is hosted in a dedicated AWS account and is built and maintained by a product owner and a development team, as illustrated in the following figure.

HEMA runs over 400 services, and 20 of them run extract, transform, and load (ETL) pipelines with dedicated data sources, which produce and consume data assets shared across the data mesh.
Data management in a data mesh
Weeks after launch, HEMA’s data platform wasn’t where the company wanted it to be. Building an agile organization that runs on reliable and streamlined processes was the primary goal. Initially, the data inventories of different services were siloed within isolated environments, making data discovery and sharing across services manual and time-consuming for all teams involved.
Implementing robust data governance is challenging. In a data mesh architecture, this complexity is amplified by the organization’s decentralized nature. In this context, HEMA concluded that data governance was not a nice-to-have, but had become a foundational piece required to build a healthy data organization.
Why HEMA selected Amazon DataZone
By exploring the preview, HEMA saw how Amazon DataZone covered all the critical pillars of data management in a single solution. It was clear how Amazon DataZone would benefit both the technical teams and the business end-users. The technical team could take advantage of a robust programmatic solution to manage the availability, accessibility, and quality of the data assets that make up the enterprise data catalog. The business end-users were given a tool to discover data assets produced within the mesh and seamlessly self-serve on their data sharing needs.
Features such as AI-generated metadata were key to providing end-users with reliable and use case-driven explanations of what a certain data product could provide and solve, while the subscription feature allowed them to start using a certain data asset within their own environment in a matter of seconds, as opposed to the existing lengthy and human-driven process.
These reasons, as well as the self-service capabilities, resulted in HEMA’s decision to adopt and roll out Amazon DataZone at the enterprise level.
Solution overview
The HEMA data landscape is multifaceted, with various teams across the organization using a range of technologies and systems, including Databricks. To effectively govern this complex data environment, HEMA has adopted a data mesh architecture on AWS. This architecture maintains a central intelligence platform (CIP) that enables the activities of both data producers and data consumers by providing the necessary platform and infrastructure. The overall structure is represented in the following figure.

Each service uses two AWS accounts, one for pre-production and one for production. This separation means changes can be tested thoroughly before being deployed to live operations.
Amazon DataZone is the central piece in this architecture. It helps HEMA centralize all data assets across disparate data stacks into a single catalog. It plays a pivotal role in bridging the gap and integrating different systems, such as Databricks and native AWS services. The integration of Databricks Delta tables into Amazon DataZone is done using the AWS Glue Data Catalog. Delta tables’ technical metadata is stored in the Data Catalog, which is a native source for creating assets in the Amazon DataZone business catalog. Access control is enforced using AWS Lake Formation, which manages fine-grained access control and data sharing on data lake data. The following figure illustrates the data mesh architecture.

The Amazon DataZone implementation follows the same approach as individual services: HEMA maintains two distinct domain data catalogs, preprod-hema-data-catalog and prod-hema-data-catalog. These catalogs serve as the backbone for data sharing across pre-production and production accounts, allowing flexible access to data assets based on the environment’s needs.
The prod-hema-data-catalog is the production-grade catalog that supports data sharing across production services and, in some cases, pre-production services. This catalog only accepts data assets produced by production services (it disallows publishing of assets that belong to pre-production services) and allows pre-production services to access production-grade data. The following diagram illustrates the architecture of both accounts.
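The publish and access rules for the production catalog can be written down as a small policy check. The following is a minimal sketch, not HEMA’s implementation and not an Amazon DataZone API: the catalog name comes from this post, while `Stage`, `can_publish`, and `can_subscribe` are hypothetical names introduced only for illustration. The rules for the pre-production catalog are not spelled out here, so the sketch deliberately leaves them unmodelled.

```python
from enum import Enum


class Stage(Enum):
    """Deployment stage of a service account (hypothetical helper type)."""
    PREPROD = "pre-production"
    PROD = "production"


# Catalog name taken from the post; everything else below is an assumption.
PROD_CATALOG = "prod-hema-data-catalog"


def can_publish(catalog: str, service_stage: Stage) -> bool:
    """Return True if a service at this stage may publish assets to the catalog."""
    if catalog == PROD_CATALOG:
        # The production catalog only accepts assets from production services.
        return service_stage is Stage.PROD
    raise ValueError(f"rules for catalog {catalog!r} are not modelled in this sketch")


def can_subscribe(catalog: str, service_stage: Stage) -> bool:
    """Return True if a service at this stage may consume assets from the catalog."""
    if catalog == PROD_CATALOG:
        # Production services, and in some cases pre-production services,
        # may access production-grade data.
        return service_stage in (Stage.PROD, Stage.PREPROD)
    raise ValueError(f"rules for catalog {catalog!r} are not modelled in this sketch")
```

For example, `can_publish(PROD_CATALOG, Stage.PREPROD)` evaluates to `False`, while `can_subscribe(PROD_CATALOG, Stage.PREPROD)` evaluates to `True`.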

To establish isolation between services in the data mesh, a project is dedicated to a unique service account. The environment profiles and environments are configured to be explicitly used only by the service. This Amazon DataZone configuration is managed centrally by the core team using AWS CloudFormation. After projects are created and configured by the central team, project teams have access to self-service capabilities to create their own environments according to their needs.
The following diagram illustrates the full workflow for onboarding HEMA service teams in Amazon DataZone.

The workflow includes the following steps:
- A service team (either a data producer or a data consumer) initiates a request to the core data platform team to enable data sharing for their service accounts. This request is typically made when a service team has a use case where they need to either publish data to the catalog (for other teams to consume) or access data that another team has published.
- After the request is received, the core data platform team assesses the requirements and initiates the creation of projects and environments in Amazon DataZone. This is done using AWS CloudFormation and a continuous integration and delivery (CI/CD) pipeline. The core data platform team makes sure that the appropriate AWS account (pre-production or production) is linked to the environment within the project in the respective catalogs.
- After the projects and environments are set up, service teams can use Amazon DataZone features to produce and consume data assets:
  - Producers (for example, Service A) can publish their data assets to the Data Catalog and approve or reject subscription requests.
  - Consumers (for example, Service B) can search and access these published data assets using the Amazon DataZone catalog and request data access through subscription requests.
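The producer and consumer interaction in the last step can be pictured as a small state machine. The following is a toy in-memory sketch under stated assumptions: `Catalog`, `SubscriptionRequest`, and their methods are hypothetical names, not the Amazon DataZone API, and the project and asset names are made up. It only illustrates the flow of publish, search, request, and approve or reject.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SubscriptionRequest:
    asset: str
    consumer_project: str
    status: str = "PENDING"  # becomes "APPROVED" or "REJECTED"


@dataclass
class Catalog:
    """Toy stand-in for a shared data catalog (hypothetical)."""
    listings: dict[str, str] = field(default_factory=dict)  # asset -> producer project
    requests: list[SubscriptionRequest] = field(default_factory=list)

    def publish(self, producer_project: str, asset: str) -> None:
        # A producer makes an asset discoverable to other teams.
        self.listings[asset] = producer_project

    def search(self, keyword: str) -> list[str]:
        # A consumer discovers assets by keyword.
        return sorted(a for a in self.listings if keyword in a)

    def request_subscription(self, consumer_project: str, asset: str) -> SubscriptionRequest:
        if asset not in self.listings:
            raise KeyError(f"asset {asset!r} is not published")
        request = SubscriptionRequest(asset, consumer_project)
        self.requests.append(request)
        return request

    def decide(self, request: SubscriptionRequest, decided_by: str, approve: bool) -> None:
        # Only the project that produced the asset may approve or reject.
        if self.listings[request.asset] != decided_by:
            raise PermissionError(f"{decided_by!r} does not own {request.asset!r}")
        request.status = "APPROVED" if approve else "REJECTED"


catalog = Catalog()
catalog.publish("service-a", "sales_daily")                       # producer publishes
found = catalog.search("sales")                                   # consumer discovers
request = catalog.request_subscription("service-b", found[0])     # consumer requests access
catalog.decide(request, decided_by="service-a", approve=True)     # producer approves
```

The point of the design is that the decision sits with the producing team, so no central team has to broker each share.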
 
In a decentralized data mesh environment, there’s a risk of service teams creating resources in service accounts they aren’t authorized to manage, which can lead to governance issues and data mismanagement. To address this challenge, HEMA adopted two principles:
- Amazon DataZone project structure – Each project contains resources that are only managed by the service team (project members) responsible for it. Each service team’s project provides a clear boundary for the resources they manage.
- Environment isolation – The core team enforces governance policies in the Amazon DataZone configuration, allowing teams to only deploy resources within their own environments.
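The two principles above amount to a pair of checks before any resource is deployed. The following is a minimal sketch, assuming each project is tied to exactly one AWS account and has a known set of members; `Project`, `check_deployment`, the account IDs, and the user names are all hypothetical and stand in for whatever the real Amazon DataZone and CloudFormation configuration enforces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    """Hypothetical model of a project dedicated to one service account."""
    name: str
    account_id: str      # the single AWS account this project is tied to
    members: frozenset   # users on the responsible service team


def check_deployment(project: Project, user: str, target_account_id: str) -> None:
    """Raise PermissionError unless the deployment stays within the project's boundary."""
    # Principle 1: only members of the project manage its resources.
    if user not in project.members:
        raise PermissionError(f"{user!r} is not a member of project {project.name!r}")
    # Principle 2: resources may only be deployed in the project's own account.
    if target_account_id != project.account_id:
        raise PermissionError(
            f"project {project.name!r} cannot deploy into account {target_account_id}"
        )


# Made-up example values for illustration only.
service_a = Project("service-a", "111111111111", frozenset({"alice"}))
check_deployment(service_a, "alice", "111111111111")  # passes silently
```

A call such as `check_deployment(service_a, "alice", "222222222222")` raises `PermissionError`, because the target account belongs to another service.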
Adoption plan: Strategy
In HEMA’s data mesh, the catalog must be built in collaboration with all the services that produce data, so the key for the central data governance team was devising an adoption plan that would add value to these teams, rather than disrupting the delivery of their projects. With that in mind, HEMA’s adoption strategy was designed on three core principles:
- Launch it – Don’t wait until you can deliver to production a full-scale service that covers every single feature available. Instead, define an MVP that solves the most critical need for the business and make it available to the business as soon as you can.
- Show value – HEMA’s data team ran multiple internal seminars and dedicated presentations with each of the involved teams to showcase how Amazon DataZone would simplify their data sharing needs. Don’t tell them they have to invest time to learn and start using a new service, but rather let them get drawn in by the advantages of the new functionality and stimulate self-adoption.
- Be there – This connects with what HEMA as a company stands for. Be close to the teams when they need help during the adoption stage, like HEMA is close to their customers whenever they need a new product for their lives. Create space for Q&A and develop a collaborative experience for everyone in their adoption curve.
Adoption plan: Action points
While deploying the adoption plan for a decentralized data marketplace using Amazon DataZone, HEMA followed a “start small, fine-tune, and iterate” approach. In practice, this meant that the Data & Cloud team started working with one business unit, then expanded to multiple business units, while focusing on one single feature: data asset subscription. To increase interest and adoption, this process was launched for the core data assets that were most used in the company.
After this part of the process was well understood and embraced by everyone, the next step was to start supporting the data pipeline adaptation work needed for each business unit.
Finally, when all teams were onboarded and familiar with the subscription feature, HEMA moved to introduce the business units to the second critical feature: data publishing. In summary, HEMA introduced new features and allowed the domains to pick up the implementation at their preferred pace before moving on to the next one.
When adoption was at a point where all core data assets were being consumed through the Amazon DataZone catalog, the Lake Formation resource links used previously to share data across accounts were decommissioned. At the same time, the Data & Cloud team ended its duty of sharing data between business units, stimulating the peer-to-peer data sharing practice, where teams can directly talk to each other without having to involve a third party.
Results
The popularity of Amazon DataZone across the enterprise ramped up quickly, and all the involved business units started using the service daily to self-serve their needs. The existence of a central data catalog enabled teams to seamlessly search, discover, share, and subscribe to data assets produced within the enterprise. Only a few months after launching the service, HEMA saw stunning statistics:
- Over 200 data assets published to the catalog
- Over 180 active subscriptions
- Over 100 active users monthly
- Over 20 business units (services) onboarded
- Data sharing average turnaround time cut from 4 working days to a few seconds, without the support of any other team
Additionally, they saw big benefits that can’t be represented by statistics. Above all, the ability to autonomously discover data produced by other teams is enabling a series of new use cases for the business, which weren’t even visible to them before because of the lack of awareness and visibility on what others were producing. For example, the data science team quickly developed a new predictive model for sales by reusing data already available in Amazon DataZone, instead of rebuilding it from scratch. This is resulting in an energized data community, which can collaborate and contribute to shaping the future of HEMA’s data operations.
Conclusion
At HEMA, Amazon DataZone made data governance a reality, and so the company wants to implement new features in close collaboration with AWS, while continuing to work on the rollout of items that are already in HEMA’s roadmap. The team is continuously developing the service, launching a series of new features that will continue to improve the data operations:
- Data quality scores – This feature helps data producers monitor and optimize their data assets, while consumers can see upfront the nuances of a certain asset that they might be using or want to use within their ETL pipelines
- Data lineage – This feature allows consumers and the central governance team to trace data sources and transformation stages, and to observe cross-organizational usage of data assets
- Fine-grained access control – This feature enables producers to be in full control of what they share with other units, making sure that only the relevant pieces of a data asset are shared with the consuming teams
The long-term vision of HEMA is clear: Amazon DataZone is set to become the central solution for data sharing and data cataloging across the enterprise. Although as of today Amazon DataZone is focused on supporting the teams running ETL pipelines, the goal is to extend the service to all the business teams that work with data, with the ultimate goal of streamlining their daily operations. Data is one of the most valuable resources a company has, and HEMA is determined to democratize its role by building an efficient data organization that relies on the most advanced data governance solution on the market.
About the authors
Luis Campos is the Data & AI Governance GTM Lead for the EMEA market at AWS, where he helps customers with their data strategies, starting with strong data governance, and uses his expertise in end-to-end data & analytics management. Luis is also a public speaking coach, based in the Netherlands, and has two boys 18 years apart, which has taught him to see things from both ends of a spectrum.
Vincent Gromakowski is a Principal Analytics Solutions Architect at AWS, where he enjoys solving customers’ data challenges. He uses his strong expertise in analytics, distributed systems, and resource orchestration platforms to be a trusted technical advisor for AWS customers.
Tommaso Paracciani is the Head of Data & Cloud Platforms at HEMA. He joined the company with the goal of modernising the Data Organization by building a cloud-based Data Platform, hosted in AWS, which would power a Data Mesh architecture. With a strong passion for both technical and organizational challenges, Tommaso leads the Solution Architecture efforts as well as all core Data Management and Data Governance initiatives, for which he’s also a passionate public speaker. Outside the office, Tommaso is a full-time dad with a passion for traveling and sports.
Oghosa Omorisiagbon is a Senior Data Engineer at HEMA. He focuses on leveraging AWS-native tools to optimise data pipelines, modernise HEMA’s data infrastructure, and introduce reliable and scalable end-to-end data architecture solutions. Outside of work, he enjoys traveling, playing video games, and outdoor activities.

