Set up data quality for Fabric mirrored databases

Mirroring in Fabric is a low-cost, low-latency data replication solution that brings data from various systems together into a single analytics platform. You can continuously replicate your existing data estate directly into Fabric's OneLake, including data from Azure SQL Database, Azure Cosmos DB, and Snowflake.

Mirroring is part of an end-to-end product designed to simplify your analytics needs. It creates a replica of your data in OneLake, where the data is readily available for analytics. For more details about Fabric mirroring, see the Fabric documentation.

Configure data quality for a Fabric mirrored database

  1. Enable mirroring in your Fabric tenant. Power BI administrators can enable or disable mirroring for the entire organization or for specific security groups by using the setting in the Power BI admin portal. To set up mirroring, create a secure connection to your operational data source and choose whether to replicate an entire database or individual tables. Mirroring then keeps your data in sync automatically, and data replicates continuously into OneLake for analytics consumption.

  2. After you enable mirroring and start replication, confirm that replication completes successfully.

  3. Open the SQL analytics endpoint.

    Screenshot showing how to navigate to the SQL analytics endpoint.

  4. On the Reporting tab, select Automatically update semantic model.

    Screenshot of the Automatically update semantic model setting.

  5. If you don't already have a lakehouse in your Fabric workspace, create one.

  6. Create a Fabric shortcut from that mirrored database to the lakehouse.

  7. Go to Microsoft Purview Data Map and run a Data Map scan on that lakehouse; ignore the mirrored database. Use service principal authentication.

    Screenshot of service principal authentication for the Data Map scan.

  8. When the scan completes, associate the new data assets with a data product. Make sure that you select the Lakehouse tables.

  9. After you associate the mirrored tables with the data product as Lakehouse tables, you can profile them and measure their data quality in Microsoft Purview.

  10. In the Data quality area of Health management in Unified Catalog, run a data quality scan or profile your data as usual.
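The steps above rely on every mirrored table being visible as a Lakehouse table before you associate it with a data product. As a minimal sketch, assuming you have copied the two table lists by hand from the Fabric portal, the following Python helper reports which mirrored tables are missing from the lakehouse. The function name, the table names, and the case-insensitive comparison are all assumptions made for illustration; the helper doesn't call any Fabric or Microsoft Purview API.

```python
def missing_in_lakehouse(mirrored_tables, lakehouse_tables):
    """Return mirrored table names that have no matching Lakehouse table.

    Names are compared case-insensitively, which is an assumption of this
    sketch rather than documented Fabric behavior.
    """
    available = {name.lower() for name in lakehouse_tables}
    return sorted(t for t in mirrored_tables if t.lower() not in available)


# Hypothetical table names, for illustration only.
mirrored = ["Customers", "Orders", "Invoices"]
lakehouse = ["customers", "orders"]

missing = missing_in_lakehouse(mirrored, lakehouse)
print(missing)  # ['Invoices']
```

An empty result suggests that every mirrored table has a shortcut in the lakehouse. Any table that the helper lists is a candidate for rechecking the shortcut and the semantic model update.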

Important

  • Use a service principal for Data Map scans, and use a managed identity for data quality scans.
  • Select the mirrored database instead of individual tables.
  • Update the semantic model every time you add a new table to the mirrored database.
  • If your mirrored database tables aren't available in the Fabric Lakehouse, contact Fabric support.
  • Data quality scanning is supported only for the Lakehouse Delta, Iceberg, and Parquet file formats.
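The authentication and file format notes in this list can be captured as a small pre-flight check. The following Python sketch is illustrative only: the `ScanPlan` class, its field names, and the string values such as `service_principal` are hypothetical and don't correspond to any Microsoft Purview API or setting name. It encodes the rules stated above and returns the ones that a planned configuration would violate.

```python
from dataclasses import dataclass

# Formats that the notes above list as supported for data quality scanning.
SUPPORTED_FORMATS = {"delta", "iceberg", "parquet"}


@dataclass
class ScanPlan:
    """Hypothetical description of a planned scan setup (not a Purview object)."""
    data_map_auth: str      # for example, "service_principal"
    data_quality_auth: str  # for example, "managed_identity"
    table_format: str       # for example, "delta"


def check_plan(plan):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if plan.data_map_auth != "service_principal":
        problems.append("Data Map scans should use a service principal.")
    if plan.data_quality_auth != "managed_identity":
        problems.append("Data quality scans should use a managed identity.")
    if plan.table_format.lower() not in SUPPORTED_FORMATS:
        problems.append(f"Unsupported format: {plan.table_format}")
    return problems


# A plan that breaks two of the three rules.
plan = ScanPlan("service_principal", "service_principal", "csv")
for problem in check_plan(plan):
    print(problem)
```

A check like this is only a reminder of the rules. Whether a scan succeeds still depends on the permissions and connections that you configure in Microsoft Purview and Fabric.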