Create and manage data products

A data product in Microsoft Purview Unified Catalog is a set of information with a defined use case that you share with other users. For example, a data product can be a sales report, an ML model, or a data model and its associated tables. A data product is a logical grouping of related physical assets that you create for a specific purpose.

It can be hard to find the value of data presented by itself, but data already associated with a purpose is easier to find and use. Data products in Unified Catalog provide practical context for your users and for intelligence systems, to help users identify what data is useful to them.

To create data products for your organization and ensure those products have enough definition to be useful to your user base, follow these instructions.

Prerequisites

View data products

  1. In the Microsoft Purview portal, open Unified Catalog.
  2. Under Catalog management, select Data products. You see a list of all the data products you have access to based on your permissions. You can scroll, sort, filter, and search through these data products to find the ones you're looking for.
  3. Select a data product name from the list to view more details on the data product's details page.

Filter your view by custom attribute

You can use business concept attributes as a filter when exploring data products.

  1. Under Catalog management, select Data products.
  2. Select Add filter.
  3. At Filter, select an attribute name from the dropdown list.
  4. At Operator, select a condition, such as Equals or Starts with, which varies based on the kinds of values allowed by the attribute.
  5. At Value, enter your value.
  6. Select Apply.
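
The Filter, Operator, and Value fields combine into a single condition. As a rough mental model only (this isn't the Purview implementation or an API), the following Python sketch applies such a condition to in-memory records. The `matches` helper, the `Region` attribute, and the sample data products are all hypothetical.

```python
from typing import Callable

# Hypothetical operators, mirroring the Equals and Starts with options in the UI.
OPERATORS: dict[str, Callable[[str, str], bool]] = {
    "Equals": lambda actual, expected: actual == expected,
    "Starts with": lambda actual, expected: actual.startswith(expected),
}


def matches(attributes: dict[str, str], name: str, operator: str, value: str) -> bool:
    """Return True if the named attribute has a value that satisfies the condition."""
    actual = attributes.get(name)
    if actual is None:
        return False  # a data product without the attribute never matches
    return OPERATORS[operator](actual, value)


# Made-up data products with a made-up "Region" custom attribute.
products = {
    "Sales report": {"Region": "EMEA-North"},
    "Churn model": {"Region": "APAC"},
    "Inventory tables": {},
}

filtered = [
    product_name
    for product_name, attrs in products.items()
    if matches(attrs, "Region", "Starts with", "EMEA")
]
print(filtered)
```

In this sketch, a data product that has no value for the chosen attribute is excluded from the results. That behavior is an assumption of the example.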

Details

The Details tab on the data product detail page provides a description and use cases. Here you find the governance ___domain, update frequency, status, owner, subscribers, terms of use, aggregate data quality score, health actions, and documentation.

  1. You see a list of all the data products you have access to based on your permissions.
  2. You can scroll, sort, filter, and search through these data products to find the ones you're looking for.
  3. For more details about a specific data product, select that data product.
  4. On the data product's detail page, you can view the governance ___domain, update frequency, status, owner, subscribers, terms of use, aggregate data quality score, health actions, and documentation.

Custom attributes

Under Custom attributes, you can see any custom attribute groups and attributes within groups, and the values set for each attribute. Only attributes with values are shown by default, but you can toggle Show attributes without a value on to view all attributes.

Data observability

The Data observability (preview) tab provides a more detailed view into data health and lineage. This view requires a brief setup process. Get details about the observability view for governance domains.

Create data products

Important

To manage or edit data products, users need at least data product owner permissions. You can only create data products in governance domains where you have data product owner access.

  1. In Unified Catalog, select Catalog management, then select Data products, then select New data product.

  2. On the Basic details page, enter a Name and Description. If you use a name that already exists, you'll see a warning during the creation process. While duplicating names isn't recommended, you won't be blocked from using a duplicate name. The description should be a business narrative about the data, where it came from, and why it was captured. The goal is for a new user to understand the basics of when, what, why, and how the data came into existence, and to get clarity about what the specific data means to the business.

  3. Select a Type from the dropdown menu. Get details about data product types.

  4. Make an optional selection from the Audience dropdown list.

  5. At Owner, add yourself and any other owners for the data product.

  6. Select Next.

  7. On the Business details page, select the governance ___domain your data product should be associated with.

  8. Enter a Use case. The use case describes what the data is used for today and how a user can effectively apply it to their own scenario. You can include information such as which filters or dimensions are available, which asset contains the view that's easiest to parse, or any other details that can accelerate use of the data. If the data is only appropriate for specific purposes, include these details so users don't request access to data that they won't be approved for or that won't help them achieve their goals.

  9. If you decide to mark the data product as Endorsed, select the Mark as Endorsed box. Learn more about endorsing a data product.

  10. Select Next.

  11. On the Custom attributes page, you can set values for all custom attributes defined by your admin in the Custom attributes section of the catalog. If any attribute is marked as required, you need to provide a value for it on this page before you can complete the process.

  12. Select Create.

You arrive at the new data product's details page. This is where you can make other edits to the data product, which is currently in a draft state.

Important

The data product isn't visible to other users yet. For it to be visible to other users, you need to add data assets, create an access policy, and then publish the data product.
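
The creation steps describe two kinds of checks: a duplicate name produces a warning but doesn't block you, while a required custom attribute without a value does block completion. The following Python sketch illustrates that distinction under stated assumptions. `DraftRequest`, `review`, and the attribute names are hypothetical and don't correspond to any Purview API.

```python
from dataclasses import dataclass, field


@dataclass
class DraftRequest:
    """Hypothetical stand-in for the values entered in the creation pages."""
    name: str
    description: str
    custom_attributes: dict[str, str] = field(default_factory=dict)


def review(request: DraftRequest, existing_names: set[str],
           required_attributes: list[str]) -> tuple[list[str], list[str]]:
    """Return (warnings, errors). Warnings don't block creation; errors do."""
    warnings: list[str] = []
    errors: list[str] = []
    if request.name in existing_names:
        # Duplicate names are discouraged but allowed.
        warnings.append(f"name '{request.name}' already exists")
    for attribute in required_attributes:
        # An absent or empty value counts as missing.
        if not request.custom_attributes.get(attribute):
            errors.append(f"required attribute '{attribute}' has no value")
    return warnings, errors


request = DraftRequest(
    name="Sales report",
    description="Monthly sales by region, sourced from the order system.",
    custom_attributes={"Region": "EMEA"},
)
warnings, errors = review(request, {"Sales report"}, ["Region", "Cost center"])
print("warnings:", warnings)
print("errors:", errors)
```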

Data product types

When you create a data product, you can optionally identify it as one of the following types. These types help identify the kinds of users that might want to use the data product, and you can use them to filter data products in a search.

  • Analytics model: Data that's transformed from raw data to support analytics; might consist of distinct fact and dimension tables.

  • Business system/Application: A common set of consumable tables from a specific source, or available in a certain application that you consume together.

  • Dashboards/Reports: Data visualizations that you use to make decisions and view common metrics; might or might not include the data used to power the visualization.

  • Dataset: An individual asset or group of assets used as a common data product type to make data discoverable for specific use cases.

  • Master and reference data: Highly reusable data that you should consistently reference and highly control due to its common use and greater importance to the health of the data estate.

  • ML training data: Data approved for use in training a new model; typically isn't production data and is desensitized to reduce potential risks.

  • ML testing data: Data built to specifically test ML models for edge cases, fairness, and potential harms.

  • Model types: Typically used for AI models and analytical or semantic models used with AI.

  • Operational: A data product that you use for data quality, or to operate a portion of the data estate, but that isn't planned for high consumption or governance. It could be an investigatory data product of transitory data assets, or data used to run a process, that isn't modeled or consolidated for analytical purposes.

  • Semantic model: A group of related data tables and their relationships and meanings for interpretation and reuse.

  • Transactional data: Data from an originating source that might have limited use cases, but is required for engineering scenarios or system-to-system integrations; not typically used for most business data consumption, but could be important for governance and control scenarios.

Edit and manage data products

Data product owners can view and modify the properties of a data product by following these steps.

  1. In Unified Catalog, select Catalog management, then select Data products.
  2. Select the data product to open its details page.
  3. Select Edit.

From the data product page, you can:

Edit data product

To edit a data product, you need data product owner permissions. You can only edit a data product when it's in Draft status.

  1. On your data product page, select the Edit button.
  2. On the Basic details page, update the name, description, type, and owners. If you change the name to one that already exists, you'll see a warning about the duplication but you can proceed to use the duplicate name.
  3. Select Save.
  4. On the Business details page, update use cases and endorsement status.
  5. Select Save.
  6. On the Custom attributes page, you can set values for all custom attributes defined by your admin in the Custom attributes section of the catalog. If any attribute is marked as required, you need to provide a value for it on this page before you can complete the process.
  7. To save your changes, select Save.

Update contacts

When you add owners to data products during the creation or editing process, the owners automatically appear in the Contacts section. You can customize the label by selecting the pencil icon in each contact card.

Update frequency

Use the update frequency to show how regularly you manage a data product. This indicator isn't currently automated.

  1. To edit the update frequency, select the Update frequency attribute on the right side of the page.
  2. Select the update frequency you expect for your data product.
  3. Select Done.

Publish, draft, and expire

To publish a drafted data product, select Publish on the data product's details page. Before you can publish, you need to add data assets to your data product, and set up a data access policy so users can request access to your data product.

Note

Ensure your governance ___domain is published before you publish your data products.

After completing the requirements, you can publish your data product, which makes the data product available for users in Unified Catalog.

You can also set a previously published data product to a draft by unpublishing it:

  1. Select the Unpublish button in the data product.
  2. Select Set to draft, which allows only stewards, governance ___domain owners, and data product owners to view and manage it.

You can also set your data product to Expired:

  1. Select the Unpublish button in the data product.
  2. Select Set to expired, which makes it visible only to stewards, governance ___domain owners, and data product owners.
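
The publishing rules in this section can be read as a small state model: a data product starts as a draft, can be published once its requirements are met, and can then be unpublished to either Draft or Expired. The following Python sketch is one way to picture this. It isn't the Purview implementation. The class and method names are hypothetical, and the sketch assumes that an unpublished governance ___domain blocks publishing and that only a published data product can be unpublished.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "Draft"
    PUBLISHED = "Published"
    EXPIRED = "Expired"


@dataclass
class DataProduct:
    """Hypothetical in-memory model of a data product's publishing state."""
    name: str
    assets: list[str] = field(default_factory=list)
    has_access_policy: bool = False
    domain_published: bool = False
    status: Status = Status.DRAFT

    def publish_blockers(self) -> list[str]:
        """List the requirements that still need to be met before publishing."""
        blockers = []
        if not self.assets:
            blockers.append("add at least one data asset")
        if not self.has_access_policy:
            blockers.append("set up a data access policy")
        if not self.domain_published:
            blockers.append("publish the governance ___domain")  # assumed to be enforced
        return blockers

    def publish(self) -> None:
        blockers = self.publish_blockers()
        if blockers:
            raise ValueError("cannot publish: " + "; ".join(blockers))
        self.status = Status.PUBLISHED

    def unpublish(self, target: Status) -> None:
        """Move a published data product to Draft or Expired."""
        if self.status is not Status.PUBLISHED:
            raise ValueError("only a published data product can be unpublished")
        if target not in (Status.DRAFT, Status.EXPIRED):
            raise ValueError("target must be Draft or Expired")
        self.status = target


product = DataProduct(name="Sales report")
print(product.publish_blockers())  # all three requirements are still open

product.assets.append("sales_fact_table")
product.has_access_policy = True
product.domain_published = True
product.publish()
product.unpublish(Status.EXPIRED)
print(product.status.value)
```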

Manage data product access policies

To manage access policies, you need data product owner or data steward permissions.

  1. If your data product is published, first unpublish it to manage policies.

  2. On your data product page, select Manage policies.

  3. From the policy configuration window, you can create and manage your data product's access policy. Learn how to set up data product access policies.

Add and remove data assets

A data product groups together data assets. Grouping assets in this way makes it easier to discover them. View the instructions in the following sections for adding data assets to, or removing assets from, a data product.

Add data assets

Note

When adding data assets, only assets that the governance ___domain is scoped for, or that the user has access to in the Data Map, appear in the search.

To add data assets to a data product:

  1. In Unified Catalog, select Catalog management, then select Data products.

  2. Select the data product you want to add assets to.

  3. Select Add data assets under the description and use cases.

  4. Search for your data assets by using keywords or filters. To add filters, select Add filter.

  5. Select any assets you want to add to the data product.

  6. Select Selected assets to edit your selected asset list.

  7. When you're done selecting assets, select Add.

You see your newly added assets in your data product.

Remove data assets

Tip

Collaborate with your data quality team to perform necessary cleanup work, as described in the following section, before removing data assets and deleting data products.

Before you can remove a data asset from a data product, you must remove all data quality rules and previous data quality scans from the asset. For more information, see deleting data quality history.

Although data quality stewards may not be involved in creating data products, they're the subject matter experts in data quality rules and scans. Only users holding the Data Quality Steward role can remove data quality rules and scans, so data product owners should work with data quality stewards to get rules and scans removed before the asset in question can be removed from the data product.

Follow these steps to remove a data asset:

  1. In Unified Catalog, select Catalog management, then select Data products.

  2. Select the data product you want to remove assets from.

  3. If the assets you want to remove are on the front page, select the ellipsis (...) on the data asset's card, then select Remove.

  4. If you don't see the assets on the main product page, select View all data assets. There you're able to search and filter the assets. To remove them, select their ellipsis button and select Remove.

    Note

    The remove action is greyed out if the data asset has data quality rules running on it or failed data quality run history. You need the Data Quality Steward role to resolve these issues before the asset can be removed.
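
The note above gives two conditions that keep the remove action unavailable. As an illustration only, the following Python sketch expresses that check against a hypothetical `Asset` record. The field names and the `removal_blockers` function are made up for this example and aren't part of any Purview API.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical summary of an asset's data quality state."""
    name: str
    active_quality_rules: int = 0
    failed_quality_runs: int = 0


def removal_blockers(asset: Asset) -> list[str]:
    """Return the reasons the remove action would be unavailable for this asset."""
    blockers = []
    if asset.active_quality_rules > 0:
        blockers.append("data quality rules are running on the asset")
    if asset.failed_quality_runs > 0:
        blockers.append("the asset has failed data quality run history")
    return blockers


assets = [
    Asset("orders_table", active_quality_rules=2),
    Asset("customers_table"),
]
for asset in assets:
    blockers = removal_blockers(asset)
    state = "can be removed" if not blockers else "blocked: " + "; ".join(blockers)
    print(f"{asset.name}: {state}")
```

A check like this can help a data product owner build the list of assets to hand to a Data Quality Steward for cleanup before removal.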

Manage linked resources

You can link glossary terms and OKRs directly to your data products to improve understanding, apply policies, and associate your data products with your business goals.

When you map assets in your data product to critical data elements, the system automatically adds those critical data elements to the data product.

Add linked resources

  1. On the data products page, select the data product you want to add terms or OKRs to.
  2. Under Glossary terms or OKRs, select the + button next to the listing of those items.
  3. Search for your terms or OKRs by using keywords or governance ___domain, or select Filter to apply filters.
  4. Select any terms or OKRs you want to add to the data product.
  5. Select the Add button.
  6. You can now see the term or OKR in the list for your data products.

Remove linked resources

  1. On the data products page, select the data product you want to remove terms or OKRs from.
  2. Under Glossary terms or OKRs, find the item you want to remove, and select its ellipsis button.
  3. Select Remove.

Note

The system automatically adds critical data elements based on the assets in your data product. Find more information about managing critical data elements.

Terms of use

  1. On the data products page, select the data product you want to manage terms of use for.
  2. Select the Terms of use attribute, on the right side of the page.
  3. To add more terms of use:
    1. Select + Add link.
    2. Choose a particular data asset to link the terms of use to (optional).
    3. Enter a friendly name for the terms.
    4. Enter the link to the terms of use.
    5. Select Create.
    6. Select Done.
  4. To remove any terms of use:
    1. Hover over the term you want to remove.
    2. Select the trashcan Remove button.
  5. When you're finished, select Done.

Documentation

  1. On the data products page, select the data product you want to manage documentation for.
  2. Select the Documentation attribute, on the right side of the page.
  3. To add documentation:
    1. Select Add link.
    2. Choose a particular data asset to link the documentation to (optional).
    3. Provide a friendly name for the documentation.
    4. Provide the link to the documentation.
    5. Select Create.
    6. Select Done.
  4. To remove any documentation:
    1. Hover over the documentation you want to remove.
    2. Select the trashcan Remove button.
  5. When you're finished, select Done.

Endorse a data product

As your Microsoft Purview Unified Catalog grows in size, it's important for data consumers to understand what data they can trust. Data consumers need to know if a data product meets their organization's quality standards and can be regarded as reliable.

Data product owners can set the 'Endorsed' flag on their data products to indicate that they've certified them, which builds confidence in the quality of those data products.

To endorse a data product, you need data product owner permissions.

  1. On your data product page, select Edit.
  2. Select Details.
  3. Select the Mark as 'Endorsed' checkbox.
  4. Select Save.

Delete data products

Deleting a data product requires planning, as several steps need to happen before the deletion.

  1. First, unpublish the data product.

    1. Also, remove links to all related business concepts and remove any data assets within the data product.
    2. If any assets have data quality rules running on them or failed data quality run history, delete the data quality history.
  2. Then, inform subscribers of the data product about the upcoming deletion and delete their access requests to the data product.

  3. When you complete all tasks, select Delete on the data product's page to delete the data product.
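
Because deletion depends on several earlier tasks, it can help to treat the steps above as an ordered checklist. The following Python sketch shows one way to do that, under stated assumptions. `ProductState` and `remaining_steps` are hypothetical names, and the fields are a simplified stand-in for information you would gather from the data product's page.

```python
from dataclasses import dataclass, field


@dataclass
class ProductState:
    """Hypothetical snapshot of what is still attached to a data product."""
    published: bool = False
    linked_concepts: list[str] = field(default_factory=list)
    assets: list[str] = field(default_factory=list)
    assets_with_quality_history: list[str] = field(default_factory=list)
    open_access_requests: int = 0


def remaining_steps(state: ProductState) -> list[str]:
    """Return the tasks still to do, in the order given in this section."""
    steps = []
    if state.published:
        steps.append("unpublish the data product")
    if state.linked_concepts:
        steps.append("remove links to related business concepts")
    if state.assets_with_quality_history:
        steps.append("delete data quality history for affected assets")
    if state.assets:
        steps.append("remove data assets")
    if state.open_access_requests:
        steps.append("inform subscribers and delete their access requests")
    return steps


state = ProductState(
    published=True,
    assets=["orders_table"],
    assets_with_quality_history=["orders_table"],
    open_access_requests=2,
)
for number, step in enumerate(remaining_steps(state), start=1):
    print(f"{number}. {step}")
```

When `remaining_steps` returns an empty list, the data product in this sketch is ready for the final Delete action.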

Next steps