
What is Lakehouse Federation?

Lakehouse Federation is the query federation platform for Databricks. The term query federation describes a collection of features that enable users and systems to run queries against multiple data sources without needing to migrate all data to a unified system.

There are two types of federation: query federation and catalog federation. This page covers the differences between the types.

Query federation compared to catalog federation

The following comparison summarizes the key differences between query federation and catalog federation:

Query path

Query federation: Unity Catalog queries are pushed down to the foreign database using JDBC. The query is run both in Databricks and using remote compute.

Catalog federation: Unity Catalog queries directly access the foreign table in object storage. Catalog federation is available for platforms that support direct access to their catalog and storage services. The query is only run on Databricks compute, meaning that catalog federation is more cost-effective and performance-optimized than query federation.

Use case

Query federation:
  • You need ad hoc reporting or proof-of-concept access to operational data stored in external databases.
  • You want to minimize data movement and maintain live access to external systems.

When your source supports both Lakehouse Federation and Lakeflow Connect, Databricks recommends Lakeflow Connect if performance on higher data volumes and lower latency are priorities.

Catalog federation:
  • You’re migrating to Unity Catalog but need to incrementally phase in data managed from a foreign catalog.
  • You want a long-term hybrid model in which some data stays in an external catalog and some data is managed by Unity Catalog.

Overview of steps

Query federation:
  • Create a connection in Unity Catalog with your access credentials and JDBC URL.
  • Create a foreign catalog using the connection.
  • Grant privileges to users on tables in the foreign catalog.
  • Run queries. These are pushed down to the external database.

Catalog federation:
  • Create a connection in Unity Catalog for accessing the external catalog.
  • Create a storage credential and an external location for the table paths.
  • Create a foreign catalog using the connection and the external location.
  • Grant privileges to users on tables in the foreign catalog.
  • Run queries. These run directly against the object storage.
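
The query federation steps listed above can be sketched as a short sequence of SQL statements. The Python helper below only builds those statements as strings; it is a minimal sketch, not the documented procedure. It assumes a PostgreSQL source and the `CREATE CONNECTION`, `CREATE FOREIGN CATALOG`, and `GRANT` statement forms of Databricks SQL, with connection option names (`host`, `port`, `user`, `password`, `database`) that may differ for other source types. The function name and every value passed to it are hypothetical placeholders.

```python
def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def query_federation_statements(
    connection: str,
    catalog: str,
    host: str,
    port: str,
    user: str,
    password: str,
    database: str,
    table: str,
    principal: str,
) -> list[str]:
    """Return one SQL statement per query federation step, in order.

    Assumption: a PostgreSQL source. Identifiers are inserted as given,
    so they must already be valid, unquoted SQL identifiers.
    """
    return [
        # Step 1: create a connection holding the access details.
        f"CREATE CONNECTION {connection} TYPE postgresql OPTIONS ("
        f"host {sql_literal(host)}, port {sql_literal(port)}, "
        f"user {sql_literal(user)}, password {sql_literal(password)})",
        # Step 2: create a foreign catalog that mirrors the remote database.
        f"CREATE FOREIGN CATALOG {catalog} USING CONNECTION {connection} "
        f"OPTIONS (database {sql_literal(database)})",
        # Step 3: grant read access on a table in the foreign catalog.
        f"GRANT SELECT ON TABLE {catalog}.{table} TO `{principal}`",
        # Step 4: run a query; with query federation it is pushed down.
        f"SELECT * FROM {catalog}.{table} LIMIT 10",
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    for statement in query_federation_statements(
        connection="pg_conn",
        catalog="pg_sales",
        host="db.example.com",
        port="5432",
        user="reader",
        password="example-password",
        database="sales",
        table="public.orders",
        principal="analysts",
    ):
        print(statement + ";")
```

In a Databricks notebook, each string could be passed to `spark.sql(...)` in order. In practice the password would normally come from a secret store rather than a literal, and users typically also need privileges on the catalog and schema before they can read a table; both are left out to keep the sketch short.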

Supported data sources

Connect to the following sources using query federation:

Connect to the following sources using catalog federation:
