Master Data Management Tools for Azure

Written by Andy McDonald
Reviewed by Johannes Åkesson
Nov 07, 2022
7 min read

Maintaining a “single source of truth” is critical for all organizations. The discipline of Master Data Management (MDM) is one of the means to achieve this. In this article, we’re going to look at some Master Data Management tools for Azure.

Introduction

Master Data Management (MDM) is an information management discipline focusing on data quality and data governance. Its goal is to mitigate problems caused when Master Data is inconsistent, fragmented, or out of date. These problems can range from minor irritations, such as duplicate entry of data in systems to larger organizational issues relating to data ownership, integrity and efficiency.

The over-arching goal for MDM is to create a “single version of the truth” for your Master Data (i.e. customers, products, employees, and physical assets). This is normally easy to achieve when organizations are very small, but as they grow, it becomes clear that some sort of strategy needs to be identified to manage data.

Many organizations use Microsoft SQL Server as their database platform and It’s likely that in most cases this will comprise on-premise databases and cloud resources such as Azure SQL. In this article we’re going to look at some master data management options that are suitable for Azure and on-premise SQL Server.

What is Master Data Management?

Master Data Management (MDM) is a discipline that focuses on data quality and data governance in an organization. An MDM initiative typically has the following main themes:

  • It relates to a company’s Master data (e.g. business entities such as customers/suppliers, employees, SKUs, product descriptions, locations)
  • The over-arching goal is to create a “single version of the truth” for your Master Data (e.g. customers, products, employees, and physical assets)
  • It is a set of disciplines, processes, and technologies
  • It covers applications, systems, and databases
  • It spans both business and IT realms
  • There is a focus on continuous data improvement
  • It requires a set of data quality dimensions to be defined (e.g. timeliness, accuracy, completeness, meaning) by the business
  • It often involves unifying data from multiple disparate systems or sources
  • It is hard to do

The last point relates to some of the difficulties associated with implementing MDM. The main reasons for failure of MDM initiatives are lack of executive support and an under-estimation of the complexity of what is involved. The first critical step in the process is therefore to identify what business problems you are trying to solve with MDM. Once this is done you need to make sure that senior stakeholders buy in to the process and give it their full support.

It is also important not to be over-awed by the process, and to start small with a simple strategy with realistic objectives. In a later article, we’ll look at some simple lightweight MDM strategies.

Microsoft Purview: an Azure data governance solution

Overview

In late 2021, Microsoft released Azure Purview (now re-branded as Microsoft Purview) – a cloud-native data governance solution to enable organizations of all sizes to manage and govern their on-premises, cloud, and software as a service (SaaS) data.

Purview includes several features which help you manage and understand data across your enterprise data landscape:

  • Discovery of data to create a unified map of data assets and their relationships
  • Glossary with business and technical search terms to aid data discovery
  • Visibility and insights into the management of sensitive data across the data estate
  • Central management of data sharing and provisioning of data access

Microsoft Purview Governance Portal Overview

Source: https://learn.microsoft.com/en-us/azure/purview/overview

Architecture

Purview is essentially a collection of 4 different apps – these are listed below:

App/Component Description
Data Map Makes your data meaningful by graphing your data assets, and their relationships, across your data estate.
Data Catalog The data catalog aligns your assets with friendly business terms and data classification to identify data sources.
Data Estate Insights Gives you an overview of your data estate to help you discover what kinds of data you have and where, including sensitive data
Data Sharing Allows you to securely share data internally or across organizations with business partners and customers.

Data Map

Microsoft Purview Data Map is essentially the foundation on which the Purview governance portal is built. It’s a cloud native service that captures metadata about your enterprise data that is present in analytics and operation systems (both on-premises and cloud).

It is automatically kept up to date with built-in automated scanning and classification system. Business users can configure and use the Data Map through a UI and developers can programmatically interact with it using open-source APIs.

Data Catalog

Microsoft Purview Data Catalog is used by business and technical users to find relevant data using a search with filters based on areas such as glossary terms, classifications, sensitivity labels, etc.

Data owners and stewards can manage the business glossary by adding or updating tags, definitions, related terms, synonyms and owners. They can also set rules to automate the tagging of data assets with glossary terms.

Data consumers and producers can also visually trace the lineage (i.e. it’s origin and movement over time) of data assets.

Data Estate Insights

Microsoft Purview Data Estate Insights is a dashboard and reporting tool aimed at governance stakeholders that need to get a bird’s eye view of their data estate to identify any governance gaps. It also allows data owners and data stewards to drill-down to identify the areas when action is required to close any governance gaps that are flagged.

Data Sharing

Microsoft Purview Data Sharing enables organizations to securely share data with people in the organization or with business partners and customers. The data is shared in-place and so there is no need for the traditional route of generating data feeds and pipelines.

Data providers can centrally manage and monitor data sharing relationships, and revoke sharing at any time. Data consumers can have near real-time access to shared data and process it with their own analytics tools and turn data into insights on their side.

Implementing Microsoft Purview with a third party application

As a data governance system, Purview is good at cataloging data sources and defining data standards, but it has no means to enforce those standards for you -this is where MDM comes in. There are several MDM vendors that have integrated their MDM platforms tightly with Purview, the most prominent ones being Profisee and CluedIn.

Microsoft Purview + Profisee

Profisee is a purpose-built MDM application which integrates tightly with Microsoft Purview. Here is a quick overview of how this combination works.

The process begins with the discovery and scanning of line-of-business (LOB) systems to identify and establish an initial set of data assets in Purview.

A data modeler then creates a master data model in Profisee using the enterprise metadata discovered by Purview and enhancing it with additional information. Note that the metadata associated with the data model is published back into Purview.

The source data is then loaded into the master data model in Profisee, using ETL processes such as Azure Data Factory (ADF) pipelines. Once the data has been loaded, a set of golden (or master) records that represent the best information that’s available across disparate LOB systems is created.  Profisee’s data quality rules engine can also contribute to data enrichment and quality improvement. As the master metadata details are published to Purview, the governance team can enhance it with enterprise-level details that help with usage and stewardship, such as glossary entries, supporting resources, data classification, sensitivity identification, ownership, and subject-matter expertise.

Enriched governance information is made available to data stewards through Profisee’s FastApp Portal, which allows stewards to make good decisions, when faced with quality issues and/or matching conflict resolution challenges. Owners and experts can be quickly identified and contacted, thus fostering collaboration between stewards, business users, and data owners.

Finally, after quality checks and stewardship activities have been completed, high-quality master data can be supplied to the business.

Microsoft Purview Profisee Process

Source: https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/data/profisee-master-data-management-purview

Microsoft Purview + CluedIn

CluedIn is a purpose-built MDM application, which, like Profisee, is part of the Microsoft Intelligent Data Platform Partner Ecosystem.

The concept of using Purview as the basis for CluedIn’s MDM solution is very similar to that of Profisee. The processes that are used to discover, scan, model, clean and enrich the data are by and large the same as that described above.

Microsoft Purview Costs

The pricing for Microsoft Purview is on a subscription-based Pay-As-You-Go model. Although the concept of only paying for what you use is a good one, the pricing model can get quite confusing, as you need to consider both direct and indirect costs across the various components of Purview. If you are feeling strong, you can check out the details of the pricing model here.

Of course, there are also license costs associated with both the Profisee and CluedIn solutions – contact the vendors for pricing information.

SQL Spreads: a light alternative for MDM tasks

If you are just starting out on your MDM journey, or if you simply need a simple way for your users to update business data and not a full-blown MDM solution, then the SQL Spreads Excel Add-In could be a good option.

SQL Spreads makes it easy for non-technical users to connect to data tables in SQL Server (both on-premise and Azure SQL) and view data and make updates, all within the familiar user environment of Excel.

SQL Spreads Master Product Data

In addition to being able to make updates to data in SQL Server, SQL Spreads also has features such as data validation and change tracking so that these important aspects can be managed from Excel, rather than SQL Server Management Studio, allowing a wider range of users to use them.

In a future article we’ll run through a simple demo of how to perform some simple MDM tasks in SQL Spreads, but in the meantime, if you have any questions about how to start an MDM initiative, please contact us. You can also download a free trial of SQL Spreads and see how easy it is to manage your SQL data from within Excel.

Related Articles

Try SQL Spreads for free

Try SQL Spreads during 14 days for free. The free trial is the fully
functional, time-limited only product.
Try Excel Add-In for Free download-image
stars-image
14-days Free Trial. No credit card required.