Databricks Asset Bundles resources

Databricks Asset Bundles allows you to specify information about the Azure Databricks resources used by the bundle in the resources mapping in the bundle configuration. See resources mapping and resources key reference.

This article outlines supported resource types for bundles and provides details and an example for each supported type. For additional examples, see Bundle configuration examples.

Tip

To generate YAML for any existing resource, use the databricks bundle generate command. See databricks bundle generate.
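
For example, to generate bundle configuration for an existing job, you can run a command like the following, where the job ID is a placeholder for an actual job ID in your workspace:

databricks bundle generate job --existing-job-id 6565621249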

Supported resources

The following table lists supported resource types for bundles. Some resources can be created by defining them in a bundle and deploying the bundle, and some resources can only be created by referencing an existing asset to include in the bundle.

Resources are defined using the corresponding Databricks REST API object's create operation request payload, where the object's supported fields, expressed as YAML, are the resource's supported properties. Links to documentation for each resource's corresponding payloads are listed in the table.

Tip

The databricks bundle validate command returns warnings if unknown resource properties are found in bundle configuration files.

Resource Corresponding REST API object
app App object
cluster Cluster object
dashboard Dashboard object
database_catalog Database catalog object
database_instance Database instance object
experiment Experiment object
job Job object
model (legacy) Model (legacy) object
model_serving_endpoint Model serving endpoint object
pipeline Pipeline object
quality_monitor Quality monitor object
registered_model (Unity Catalog) Registered model object
schema (Unity Catalog) Schema object
secret_scope Secret scope object
sql_warehouse SQL warehouse object
synced_database_table Synced database table object
volume (Unity Catalog) Volume object

app

Type: Map

The app resource defines a Databricks app. For information about Databricks Apps, see Databricks Apps.

To add an app, specify the settings to define the app, including the required source_code_path.

Tip

You can initialize a bundle with a Streamlit Databricks app using the following command:

databricks bundle init https://github.com/databricks/bundle-examples --template-dir contrib/templates/streamlit-app
apps:
  <app-name>:
    <app-field-name>: <app-field-value>
Key Type Description
budget_policy_id String The budget policy ID for the app.
config Map Deprecated. Define your app configuration commands and environment variables in the app.yaml file instead. See Configure a Databricks app.
description String The description of the app.
name String The name of the app. The name must contain only lowercase alphanumeric characters and hyphens. It must be unique within the workspace.
permissions Sequence The app's permissions. See permissions.
resources Sequence The app compute resources. See apps.name.resources.
source_code_path String The local path of the Databricks app source code, for example ./app. This field is required.
user_api_scopes Sequence The user API scopes.

apps.name.resources

Type: Sequence

The compute resources for the app. See resources in the Azure Databricks API reference.

Key Type Description
description String The description of the app resource.
database Map The settings that identify the Lakebase database to use.
job Map The settings that identify the job resource to use.
name String The name of the app resource.
secret Map The settings that identify the Azure Databricks secret resource to use.
serving_endpoint Map The settings that identify the model serving endpoint resource to use.
sql_warehouse Map The settings that identify the SQL warehouse resource to use.
uc_securable Map The settings that identify the Unity Catalog securable (such as a volume) to use.

Example

The following example creates an app named my_app that manages a job created by the bundle:

resources:
  jobs:
    # Define a job in the bundle
    hello_world:
      name: hello_world
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/main.py
          environment_key: default

      environments:
        - environment_key: default
          spec:
            environment_version: '2'

  # Define an app that manages the job in the bundle
  apps:
    job_manager:
      name: 'job_manager_app'
      description: 'An app which manages a job created by this bundle'

      # The ___location of the source code for the app
      source_code_path: ../src/app

      # The resources in the bundle which this app has access to. This binds the resource in the app with the bundle resource.
      resources:
        - name: 'app-job'
          job:
            id: ${resources.jobs.hello_world.id}
            permission: 'CAN_MANAGE_RUN'

The corresponding app.yaml defines the configuration for running the app:

command:
  - flask
  - --app
  - app
  - run
  - --debug
env:
  - name: JOB_ID
    valueFrom: 'app-job'

For the complete Databricks app example bundle, see the bundle-examples GitHub repository.

cluster

Type: Map

The cluster resource defines a cluster.

clusters:
  <cluster-name>:
    <cluster-field-name>: <cluster-field-value>
Key Type Description
apply_policy_default_values Boolean When set to true, fixed and default values from the policy will be used for fields that are omitted. When set to false, only fixed values from the policy will be applied.
autoscale Map Parameters needed in order to automatically scale clusters up and down based on load. See autoscale.
autotermination_minutes Integer Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.
aws_attributes Map Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used. See aws_attributes.
azure_attributes Map Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used. See azure_attributes.
cluster_log_conf Map The configuration for delivering spark logs to a long-term storage destination. See cluster_log_conf.
cluster_name String Cluster name requested by the user. This doesn't have to be unique. If not specified at creation, the cluster name will be an empty string.
custom_tags Map Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags. See custom_tags.
data_security_mode String The data governance model to use when accessing data from a cluster. See data_security_mode.
docker_image Map The custom docker image. See docker_image.
driver_instance_pool_id String The optional ID of the instance pool to use for the cluster's driver node. The cluster uses the instance pool with ID instance_pool_id if the driver pool is not assigned.
driver_node_type_id String The node type of the Spark driver. This field is optional; if unset, the driver node type is set to the same value as node_type_id defined above. This field, along with node_type_id, should not be set if virtual_cluster_size is set. If driver_node_type_id, node_type_id, and virtual_cluster_size are all specified, driver_node_type_id and node_type_id take precedence.
enable_elastic_disk Boolean Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. This feature requires specific AWS permissions to function correctly - refer to the User Guide for more details.
enable_local_disk_encryption Boolean Whether to enable LUKS on the cluster VMs' local disks.
gcp_attributes Map Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used. See gcp_attributes.
init_scripts Sequence The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. See init_scripts.
instance_pool_id String The optional ID of the instance pool to which the cluster belongs.
is_single_node Boolean This field can only be used when kind = CLASSIC_PREVIEW. When set to true, Databricks automatically sets the single-node-related custom_tags, spark_conf, and num_workers.
kind String The kind of compute described by this compute specification.
node_type_id String This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the clusters/listNodeTypes API call.
num_workers Integer Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.
permissions Sequence The cluster permissions. See permissions.
policy_id String The ID of the cluster policy used to create the cluster if applicable.
runtime_engine String Determines the cluster's runtime engine, either STANDARD or PHOTON.
single_user_name String The single user name if data_security_mode is SINGLE_USER.
spark_conf Map An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively. See spark_conf.
spark_env_vars Map An object containing a set of optional, user-specified environment variable key-value pairs.
spark_version String The Spark version of the cluster, for example 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the clusters/sparkVersions API call.
ssh_public_keys Sequence SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name ubuntu on port 2200. Up to 10 keys can be specified.
use_ml_runtime Boolean This field can only be used when kind = CLASSIC_PREVIEW. The effective_spark_version is determined by spark_version (the DBR release), this field (use_ml_runtime), and whether node_type_id is a GPU node.
workload_type Map Cluster Attributes showing for clusters workload types. See workload_type.

Examples

The following example creates a dedicated (single-user) cluster for the current user with Databricks Runtime 15.4 LTS and a cluster policy:

resources:
  clusters:
    my_cluster:
      num_workers: 0
      node_type_id: 'i3.xlarge'
      driver_node_type_id: 'i3.xlarge'
      spark_version: '15.4.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'
      autotermination_minutes: 60
      enable_elastic_disk: true
      single_user_name: ${workspace.current_user.userName}
      policy_id: '000128DB309672CA'
      enable_local_disk_encryption: false
      data_security_mode: SINGLE_USER
      runtime_engine: STANDARD

This example creates a simple cluster my_cluster and sets that as the cluster to use to run the notebook in my_job:

bundle:
  name: clusters

resources:
  clusters:
    my_cluster:
      num_workers: 2
      node_type_id: 'i3.xlarge'
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: '13.3.x-scala2.12'
      spark_conf:
        'spark.executor.memory': '2g'

  jobs:
    my_job:
      tasks:
        - task_key: test_task
          notebook_task:
            notebook_path: './src/my_notebook.py'
          existing_cluster_id: ${resources.clusters.my_cluster.id}

dashboard

Type: Map

The dashboard resource allows you to manage AI/BI dashboards in a bundle. For information about AI/BI dashboards, see Dashboards.

Note

When using Databricks Asset Bundles with dashboard Git support, prevent duplicate dashboards from being generated by adding the sync mapping to exclude the dashboards from synchronizing as files:

sync:
  exclude:
    - src/*.lvdash.json
dashboards:
  <dashboard-name>:
    <dashboard-field-name>: <dashboard-field-value>
Key Type Description
display_name String The display name of the dashboard.
embed_credentials Boolean Whether the bundle deployment identity credentials are used to execute queries for all dashboard viewers. If it is set to false, a viewer's credentials are used. The default value is false.
etag String The etag for the dashboard. Can be optionally provided on updates to ensure that the dashboard has not been modified since the last read.
file_path String The local path of the dashboard asset, including the file name. Exported dashboards always have the file extension .lvdash.json.
permissions Sequence The dashboard permissions. See permissions.
serialized_dashboard Any The contents of the dashboard in serialized string form.
warehouse_id String The warehouse ID used to run the dashboard.

Example

The following example includes and deploys the sample NYC Taxi Trip Analysis dashboard to the Databricks workspace.

resources:
  dashboards:
    nyc_taxi_trip_analysis:
      display_name: 'NYC Taxi Trip Analysis'
      file_path: ../src/nyc_taxi_trip_analysis.lvdash.json
      warehouse_id: ${var.warehouse_id}

If you modify the dashboard using the UI, those modifications are not applied to the dashboard JSON file in the local bundle unless you explicitly update it using bundle generate. You can use the --watch option to continuously poll and retrieve changes to the dashboard. See databricks bundle generate.

In addition, if you attempt to deploy a bundle that contains a dashboard JSON file that is different than the one in the remote workspace, an error will occur. To force the deploy and overwrite the dashboard in the remote workspace with the local one, use the --force option. See databricks bundle deploy.
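
For example, the following commands retrieve remote changes for the dashboard and force a deployment that overwrites the remote dashboard. The resource key nyc_taxi_trip_analysis is taken from the example above; adjust it to match your own bundle configuration:

# Continuously poll for and retrieve remote changes to the dashboard
databricks bundle generate dashboard --resource nyc_taxi_trip_analysis --watch

# Overwrite the dashboard in the remote workspace with the local one
databricks bundle deploy --force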

database_catalog

Type: Map

The database catalog resource allows you to define database catalogs that correspond to database instances in a bundle. A database catalog is a Lakebase database that is registered as a Unity Catalog catalog.

For information about database catalogs, see Create a catalog.

database_catalogs:
  <database_catalog-name>:
    <database_catalog-field-name>: <database_catalog-field-value>
Key Type Description
create_database_if_not_exists Boolean Whether to create the database if it does not exist.
database_instance_name String The name of the instance housing the database.
database_name String The name of the database (in an instance) associated with the catalog.
name String The name of the catalog in Unity Catalog.

Example

The following example defines a database instance with a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: ${resources.database_instances.my_instance.name}
      name: example_catalog
      database_name: my_database
      create_database_if_not_exists: true

database_instance

Type: Map

The database instance resource allows you to define database instances in a bundle. A Lakebase database instance manages storage and compute resources and provides the endpoints that users connect to.

Important

When you deploy a bundle with a database instance, the instance immediately starts running and is subject to pricing. See Lakebase pricing.

For information about database instances, see What is a database instance?.

database_instances:
  <database_instance-name>:
    <database_instance-field-name>: <database_instance-field-value>
Key Type Description
capacity String The SKU of the instance. Valid values are CU_1, CU_2, CU_4, CU_8.
enable_pg_native_login Boolean Whether the instance has PG native password login enabled. Defaults to true.
enable_readable_secondaries Boolean Whether to enable secondaries to serve read-only traffic. Defaults to false.
name String The name of the instance. This is the unique identifier for the instance.
node_count Integer The number of nodes in the instance, composed of 1 primary and 0 or more secondaries. Defaults to 1 primary and 0 secondaries.
parent_instance_ref Map The reference to the parent instance. This is only available if the instance is a child instance. As input, it specifies the parent instance when creating a child instance. See parent instance.
permissions Sequence The database instance's permissions. See permissions.
retention_window_in_days Integer The retention window for the instance. This is the time window in days for which the historical data is retained. The default value is 7 days. Valid values are 2 to 35 days.
stopped Boolean Whether the instance is stopped.

Example

The following example defines a database instance with a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: ${resources.database_instances.my_instance.name}
      name: example_catalog
      database_name: my_database
      create_database_if_not_exists: true

For an example bundle that demonstrates how to define a database instance and corresponding database catalog, see the bundle-examples GitHub repository.

experiment

Type: Map

The experiment resource allows you to define MLflow experiments in a bundle. For information about MLflow experiments, see Organize training runs with MLflow experiments.

experiments:
  <experiment-name>:
    <experiment-field-name>: <experiment-field-value>
Key Type Description
artifact_location String The ___location where artifacts for the experiment are stored.
name String The friendly name that identifies the experiment. An experiment name must be an absolute path in the Databricks workspace, for example /Workspace/Users/someone@example.com/my_experiment.
permissions Sequence The experiment's permissions. See permissions.
tags Sequence Additional metadata key-value pairs. See tags.

Example

The following example defines an experiment that all users can view:

resources:
  experiments:
    experiment:
      name: /Workspace/Users/someone@example.com/my_experiment
      permissions:
        - level: CAN_READ
          group_name: users
      description: MLflow experiment used to track runs

job

Type: Map

The job resource allows you to define jobs and their corresponding tasks in your bundle. For information about jobs, see Lakeflow Jobs. For a tutorial that uses a Databricks Asset Bundles template to create a job, see Develop a job with Databricks Asset Bundles.

jobs:
  <job-name>:
    <job-field-name>: <job-field-value>
Key Type Description
budget_policy_id String The id of the user-specified budget policy to use for this job. If not specified, a default budget policy may be applied when creating or modifying the job. See effective_budget_policy_id for the budget policy used by this workload.
continuous Map An optional continuous property for this job. The continuous property will ensure that there is always one run executing. Only one of schedule and continuous can be used. See continuous.
deployment Map Deployment information for jobs managed by external sources. See deployment.
description String An optional description for the job. The maximum length is 27700 characters in UTF-8 encoding.
edit_mode String Edit mode of the job, either UI_LOCKED or EDITABLE.
email_notifications Map An optional set of email addresses that is notified when runs of this job begin or complete as well as when this job is deleted. See email_notifications.
environments Sequence A list of task execution environment specifications that can be referenced by serverless tasks of this job. An environment is required to be present for serverless tasks. For serverless notebook tasks, the environment is accessible in the notebook environment panel. For other serverless tasks, the task environment is required to be specified using environment_key in the task settings. See environments.
format String The format of the job.
git_source Map An optional specification for a remote Git repository containing the source code used by tasks.
Important: The git_source field and task source field set to GIT are not recommended for bundles, because local relative paths may not point to the same content in the Git repository, and bundles expect that a deployed job has the same content as the local copy from where it was deployed.
Instead, clone the repository locally and set up your bundle project within this repository, so that the source for tasks is the workspace.
health Map An optional set of health rules that can be defined for this job. See health.
job_clusters Sequence A list of job cluster specifications that can be shared and reused by tasks of this job. See clusters.
max_concurrent_runs Integer An optional maximum allowed number of concurrent runs of the job. Set this value if you want to be able to execute multiple runs of the same job concurrently. See max_concurrent_runs.
name String An optional name for the job. The maximum length is 4096 bytes in UTF-8 encoding.
notification_settings Map Optional notification settings that are used when sending notifications to each of the email_notifications and webhook_notifications for this job. See notification_settings.
parameters Sequence Job-level parameter definitions. See parameters.
performance_target String PerformanceTarget defines how performant or cost-efficient the execution of a run on serverless compute should be.
permissions Sequence The job's permissions. See permissions.
queue Map The queue settings of the job. See queue.
run_as Map Write-only setting. Specifies the user or service principal that the job runs as. If not specified, the job runs as the user who created the job. Either user_name or service_principal_name should be specified. If not, an error is thrown. See Specify a run identity for a Databricks Asset Bundles workflow.
schedule Map An optional periodic schedule for this job. The default behavior is that the job only runs when triggered by clicking “Run Now” in the Jobs UI or sending an API request to runNow. See schedule.
tags Map A map of tags associated with the job. These are forwarded to the cluster as cluster tags for jobs clusters, and are subject to the same limitations as cluster tags. A maximum of 25 tags can be added to the job.
tasks Sequence A list of task specifications to be executed by this job. See Add tasks to jobs in Databricks Asset Bundles.
timeout_seconds Integer An optional timeout applied to each run of this job. A value of 0 means no timeout.
trigger Map A configuration to trigger a run when certain conditions are met. See trigger.
webhook_notifications Map A collection of system notification IDs to notify when runs of this job begin or complete. See webhook_notifications.

Examples

The following example defines a job with the resource key hello-job with one notebook task:

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py

The following example defines a job with a SQL notebook:

resources:
  jobs:
    job_with_sql_notebook:
      name: 'Job to demonstrate using a SQL notebook with a SQL warehouse'
      tasks:
        - task_key: notebook
          notebook_task:
            notebook_path: ./select.sql
            warehouse_id: 799f096837fzzzz4
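
The following is a minimal sketch of a job that combines a schedule with job-level parameters; the cron expression, parameter values, and resource key are illustrative:

resources:
  jobs:
    scheduled_job:
      name: scheduled_job
      schedule:
        quartz_cron_expression: '0 0 8 * * ?' # Run every day at 8am
        timezone_id: UTC
      parameters:
        - name: environment
          default: dev
      tasks:
        - task_key: main_task
          notebook_task:
            notebook_path: ./hello.py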

For additional job configuration examples, see Job configuration.

For information about defining job tasks and overriding job settings, see Add tasks to jobs in Databricks Asset Bundles and Override job tasks settings in Databricks Asset Bundles.

model (legacy)

Type: Map

The model resource allows you to define legacy models in bundles. Databricks recommends that you use Unity Catalog registered models instead. See registered_model (Unity Catalog).
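
The following is a minimal sketch of a legacy model definition, assuming the models mapping key and illustrative field values:

resources:
  models:
    my_legacy_model:
      name: my-legacy-model
      description: A legacy Workspace Model Registry model
      permissions:
        - level: CAN_READ
          group_name: users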

model_serving_endpoint

Type: Map

The model_serving_endpoint resource allows you to define model serving endpoints. See Manage model serving endpoints.

model_serving_endpoints:
  <model_serving_endpoint-name>:
    <model_serving_endpoint-field-name>: <model_serving_endpoint-field-value>
Key Type Description
ai_gateway Map The AI Gateway configuration for the serving endpoint. NOTE: Only external model and provisioned throughput endpoints are currently supported. See ai_gateway.
config Map The core config of the serving endpoint. See config.
name String The name of the serving endpoint. This field is required and must be unique across a Databricks workspace. An endpoint name can consist of alphanumeric characters, dashes, and underscores.
permissions Sequence The model serving endpoint's permissions. See permissions.
rate_limits Sequence Deprecated. Rate limits to be applied to the serving endpoint. Use AI Gateway to manage rate limits.
route_optimized Boolean Enable route optimization for the serving endpoint.
tags Sequence Tags to be attached to the serving endpoint and automatically propagated to billing logs. See tags.

Example

The following example defines a Unity Catalog model serving endpoint:

resources:
  model_serving_endpoints:
    uc_model_serving_endpoint:
      name: 'uc-model-endpoint'
      config:
        served_entities:
          - entity_name: 'myCatalog.mySchema.my-ads-model'
            entity_version: '10'
            workload_size: 'Small'
            scale_to_zero_enabled: 'true'
        traffic_config:
          routes:
            - served_model_name: 'my-ads-model-10'
              traffic_percentage: '100'
      tags:
        - key: 'team'
          value: 'data science'

pipeline

Type: Map

The pipeline resource allows you to define Lakeflow Declarative Pipelines in a bundle. For information about pipelines, see Lakeflow Declarative Pipelines. For a tutorial that uses a Databricks Asset Bundles template to create a pipeline, see Develop Lakeflow Declarative Pipelines with Databricks Asset Bundles.

pipelines:
  <pipeline-name>:
    <pipeline-field-name>: <pipeline-field-value>
Key Type Description
allow_duplicate_names Boolean If false, deployment will fail if the name conflicts with that of another pipeline.
catalog String A catalog in Unity Catalog to publish data from this pipeline to. If target is specified, tables in this pipeline are published to a target schema inside catalog (for example, catalog.target.table). If target is not specified, no data is published to Unity Catalog.
channel String The Lakeflow Declarative Pipelines Release Channel that specifies which version of Lakeflow Declarative Pipelines to use.
clusters Sequence The cluster settings for this pipeline deployment. See cluster.
configuration Map The configuration for this pipeline execution.
continuous Boolean Whether the pipeline is continuous or triggered. This replaces trigger.
deployment Map Deployment type of this pipeline. See deployment.
development Boolean Whether the pipeline is in development mode. Defaults to false.
dry_run Boolean Whether the pipeline is a dry run pipeline.
edition String The pipeline product edition.
environment Map The environment specification for this pipeline used to install dependencies on serverless compute. This key is only supported in Databricks CLI version 0.258 and above.
event_log Map The event log configuration for this pipeline. See event_log.
filters Map The filters that determine which pipeline packages to include in the deployed graph. See filters.
id String Unique identifier for this pipeline.
ingestion_definition Map The configuration for a managed ingestion pipeline. These settings cannot be used with the libraries, schema, target, or catalog settings. See ingestion_definition.
libraries Sequence Libraries or code needed by this deployment. See libraries.
name String A friendly name for this pipeline.
notifications Sequence The notification settings for this pipeline. See notifications.
permissions Sequence The pipeline's permissions. See permissions.
photon Boolean Whether Photon is enabled for this pipeline.
root_path String The root path for this pipeline. This is used as the root directory when editing the pipeline in the Databricks user interface and it is added to sys.path when executing Python sources during pipeline execution.
run_as Map The identity that the pipeline runs as. If not specified, the pipeline runs as the user who created the pipeline. Only user_name or service_principal_name can be specified. If both are specified, an error is thrown. See Specify a run identity for a Databricks Asset Bundles workflow.
schema String The default schema (database) where tables are read from or published to.
serverless Boolean Whether serverless compute is enabled for this pipeline.
storage String The DBFS root directory for storing checkpoints and tables.
tags Map A map of tags associated with the pipeline. These are forwarded to the cluster as cluster tags, and are therefore subject to the same limitations. A maximum of 25 tags can be added to the pipeline.
target String Target schema (database) to add tables in this pipeline to. Exactly one of schema or target must be specified. To publish to Unity Catalog, also specify catalog. This legacy field is deprecated for pipeline creation in favor of the schema field.

Example

The following example defines a pipeline with the resource key hello-pipeline:

resources:
  pipelines:
    hello-pipeline:
      name: hello-pipeline
      clusters:
        - label: default
          num_workers: 1
      development: true
      continuous: false
      channel: CURRENT
      edition: CORE
      photon: false
      libraries:
        - notebook:
            path: ./pipeline.py

For additional pipeline configuration examples, see Pipeline configuration.

quality_monitor (Unity Catalog)

Type: Map

The quality_monitor resource allows you to define a Unity Catalog table monitor. For information about monitors, see Introduction to Databricks Lakehouse Monitoring.

quality_monitors:
  <quality_monitor-name>:
    <quality_monitor-field-name>: <quality_monitor-field-value>
Key Type Description
assets_dir String The directory to store monitoring assets (e.g. dashboard, metric tables).
baseline_table_name String The name of the baseline table from which drift metrics are computed. Columns in the monitored table should also be present in the baseline table.
custom_metrics Sequence Custom metrics to compute on the monitored table. These can be aggregate metrics, derived metrics (from already computed aggregate metrics), or drift metrics (comparing metrics across time windows). See custom_metrics.
inference_log Map Configuration for monitoring inference logs. See inference_log.
notifications Map The notification settings for the monitor. See notifications.
output_schema_name String Schema where output metric tables are created.
schedule Map The schedule for automatically updating and refreshing metric tables. See schedule.
skip_builtin_dashboard Boolean Whether to skip creating a default dashboard summarizing data quality metrics.
slicing_exprs Sequence List of column expressions to slice data with for targeted analysis. The data is grouped by each expression independently, resulting in a separate slice for each predicate and its complements. For high-cardinality columns, only the top 100 unique values by frequency will generate slices.
snapshot Map Configuration for monitoring snapshot tables.
table_name String The full name of the table.
time_series Map Configuration for monitoring time series tables. See time_series.
warehouse_id String Optional argument to specify the warehouse for dashboard creation. If not specified, the first running warehouse will be used.

Examples

For a complete example bundle that defines a quality_monitor, see the mlops_demo bundle.

The following examples define quality monitors for InferenceLog, TimeSeries, and Snapshot profile types.

# InferenceLog profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 day]
        model_id_col: model_id
        prediction_col: prediction
        label_col: price
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC
# TimeSeries profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      time_series:
        granularities: [30 minutes]
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC
# Snapshot profile type
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Workspace/Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      snapshot: {}
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run Every day at 8am
        timezone_id: UTC

registered_model (Unity Catalog)

Type: Map

The registered model resource allows you to define models in Unity Catalog. For information about Unity Catalog registered models, see Manage model lifecycle in Unity Catalog.

registered_models:
  <registered_model-name>:
    <registered_model-field-name>: <registered_model-field-value>
Key Type Description
catalog_name String The name of the catalog where the schema and the registered model reside.
comment String The comment attached to the registered model.
grants Sequence The grants associated with the registered model. See grant.
name String The name of the registered model.
schema_name String The name of the schema where the registered model resides.
storage_location String The storage ___location on the cloud under which model version data files are stored.

Example

The following example defines a registered model in Unity Catalog:

resources:
  registered_models:
    model:
      name: my_model
      catalog_name: ${bundle.target}
      schema_name: mlops_schema
      comment: Registered model in Unity Catalog for ${bundle.target} deployment target
      grants:
        - privileges:
            - EXECUTE
          principal: account users

schema (Unity Catalog)

Type: Map

The schema resource type allows you to define Unity Catalog schemas for tables and other assets in your workflows and pipelines created as part of a bundle. A schema, different from other resource types, has the following limitations:

  • The owner of a schema resource is always the deployment user, and cannot be changed. If run_as is specified in the bundle, it will be ignored by operations on the schema.
  • Only fields supported by the corresponding Schemas object create API are available for the schema resource. For example, enable_predictive_optimization is not supported as it is only available on the update API.
schemas:
  <schema-name>:
    <schema-field-name>: <schema-field-value>
Key Type Description
catalog_name String The name of the parent catalog.
comment String A user-provided free-form text description.
grants Sequence The grants associated with the schema. See grant.
name String The name of the schema, relative to the parent catalog.
properties Map A map of key-value properties attached to the schema.
storage_root String The storage root URL for managed tables within the schema.

Examples

The following example defines a pipeline with the resource key my_pipeline that creates a Unity Catalog schema with the key my_schema as the target:

resources:
  pipelines:
    my_pipeline:
      name: test-pipeline-{{.unique_id}}
      libraries:
        - notebook:
            path: ../src/nb.ipynb
        - file:
            path: ../src/range.sql
      development: true
      catalog: ${resources.schemas.my_schema.catalog_name}
      target: ${resources.schemas.my_schema.id}

  schemas:
    my_schema:
      name: test-schema-{{.unique_id}}
      catalog_name: main
      comment: This schema was created by Databricks Asset Bundles.

A top-level grants mapping is not supported by Databricks Asset Bundles, so if you want to set grants for a schema, define the grants for the schema within the schemas mapping. For more information about grants, see Show, grant, and revoke privileges.

The following example defines a Unity Catalog schema with grants:

resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - SELECT
        - principal: my_team
          privileges:
            - CAN_MANAGE
      catalog_name: main

secret_scope

Type: Map

The secret_scope resource allows you to define secret scopes in a bundle. For information about secret scopes, see Secret management.

secret_scopes:
  <secret_scope-name>:
    <secret_scope-field-name>: <secret_scope-field-value>
Key Type Description
backend_type String The backend type the scope will be created with. If not specified, this defaults to DATABRICKS.
keyvault_metadata Map The metadata for the secret scope if the backend_type is AZURE_KEYVAULT.
name String Scope name requested by the user. Scope names are unique.
permissions Sequence The permissions to apply to the secret scope. Permissions are managed via secret scope ACLs. See permissions.

Examples

The following example defines a secret scope that uses an Azure Key Vault backend:

resources:
  secret_scopes:
    secret_scope_azure:
      name: test-secrets-azure-backend
      backend_type: 'AZURE_KEYVAULT'
      keyvault_metadata:
        resource_id: my_azure_keyvault_id
        dns_name: my_azure_keyvault_dns_name

The following example sets a custom ACL using secret scopes and permissions:

resources:
  secret_scopes:
    my_secret_scope:
      name: my_secret_scope
      permissions:
        - user_name: admins
          level: WRITE
        - user_name: users
          level: READ

For an example bundle that demonstrates how to define a secret scope and a job with a task that reads from it in a bundle, see the bundle-examples GitHub repository.

sql_warehouse

Type: Map

The SQL warehouse resource allows you to define a SQL warehouse in a bundle. For information about SQL warehouses, see Data warehousing on Azure Databricks.

sql_warehouses:
  <sql-warehouse-name>:
    <sql-warehouse-field-name>: <sql-warehouse-field-value>
Key Type Description
auto_stop_mins Integer The amount of time in minutes that a SQL warehouse must be idle (for example, no RUNNING queries), before it is automatically stopped. Valid values are 0, which indicates no autostop, or greater than or equal to 10. The default is 120.
channel String The channel details.
cluster_size String The size of the clusters allocated for this warehouse. Increasing the size of a Spark cluster allows you to run larger queries on it. If you want to increase the number of concurrent queries, tune max_num_clusters. For supported values, see cluster_size.
creator_name String The name of the user that created the warehouse.
enable_photon Boolean Whether the warehouse should use Photon optimized clusters. Defaults to false.
enable_serverless_compute Boolean Whether the warehouse should use serverless compute.
instance_profile_arn String Deprecated. The instance profile used to pass an IAM role to the cluster.
max_num_clusters Integer The maximum number of clusters that the autoscaler will create to handle concurrent queries. Values must be less than or equal to 30 and greater than or equal to min_num_clusters. Defaults to min_num_clusters if unset.
min_num_clusters Integer The minimum number of available clusters that will be maintained for this SQL warehouse. Increasing this will ensure that a larger number of clusters are always running and therefore may reduce the cold start time for new queries. This is similar to reserved vs. revocable cores in a resource manager. Values must be greater than 0 and less than or equal to min(max_num_clusters, 30). Defaults to 1.
name String The logical name for the cluster. The name must be unique within an org and less than 100 characters.
spot_instance_policy String Whether to use spot instances. Valid values are POLICY_UNSPECIFIED, COST_OPTIMIZED, RELIABILITY_OPTIMIZED. The default is COST_OPTIMIZED.
tags Map A set of key-value pairs that will be tagged on all resources (e.g., AWS instances and EBS volumes) associated with this SQL warehouse. The number of tags must be less than 45.
warehouse_type String The warehouse type, PRO or CLASSIC. If you want to use serverless compute, set this field to PRO and also set the field enable_serverless_compute to true.

Example

The following example defines a SQL warehouse:

resources:
  sql_warehouses:
    my_sql_warehouse:
      name: my_sql_warehouse
      cluster_size: X-Large
      enable_serverless_compute: true
      max_num_clusters: 3
      min_num_clusters: 1
      auto_stop_mins: 60
      warehouse_type: PRO

synced_database_table

Type: Map

The synced database table resource allows you to define Lakebase database tables in a bundle.

For information about synced database tables, see What is a database instance?.

synced_database_tables:
  <synced_database_table-name>:
    <synced_database_table-field-name>: <synced_database_table-field-value>
Key Type Description
database_instance_name String The name of the target database instance. This is required when creating synced database tables in standard catalogs. This is optional when creating synced database tables in registered catalogs.
logical_database_name String The name of the target Postgres database object (logical database) for this table.
name String The full name of the table, in the form catalog.schema.table.
spec Map The database table specification. See synced database table specification.

Example

The following example defines a synced database table within a corresponding database catalog:

resources:
  database_instances:
    my_instance:
      name: my-instance
      capacity: CU_1
  database_catalogs:
    my_catalog:
      database_instance_name: my-instance
      database_name: 'my_database'
      name: my_catalog
      create_database_if_not_exists: true
  synced_database_tables:
    my_synced_table:
      name: ${resources.database_catalogs.my_catalog.name}.${resources.database_catalogs.my_catalog.database_name}.my_destination_table
      database_instance_name: ${resources.database_catalogs.my_catalog.database_instance_name}
      logical_database_name: ${resources.database_catalogs.my_catalog.database_name}
      spec:
        source_table_full_name: 'my_source_table'
        scheduling_policy: SNAPSHOT
        primary_key_columns:
          - my_pk_column
        new_pipeline_spec:
          storage_catalog: 'my_delta_catalog'
          storage_schema: 'my_delta_schema'

The following example defines a synced database table inside a standard catalog:

resources:
  synced_database_tables:
    my_synced_table:
      name: 'my_standard_catalog.public.synced_table'
      # database_instance_name is required for synced tables created in standard catalogs.
      database_instance_name: 'my-database-instance'
      # logical_database_name is required for synced tables created in standard catalogs:
      logical_database_name: ${resources.database_catalogs.my_catalog.database_name}
      spec:
        source_table_full_name: 'source_catalog.schema.table'
        scheduling_policy: SNAPSHOT
        primary_key_columns:
          - my_pk_column
        create_database_objects_if_missing: true
        new_pipeline_spec:
          storage_catalog: 'my_delta_catalog'
          storage_schema: 'my_delta_schema'

volume (Unity Catalog)

Type: Map

The volume resource type allows you to define and create Unity Catalog volumes as part of a bundle. When deploying a bundle with a volume defined, note that:

  • A volume cannot be referenced in the artifact_path for the bundle until it exists in the workspace. Hence, if you want to use Databricks Asset Bundles to create the volume, you must first define the volume in the bundle, deploy it to create the volume, then reference it in the artifact_path in subsequent deployments.
  • Volumes in the bundle are not prepended with the dev_${workspace.current_user.short_name} prefix when the deployment target has mode: development configured. However, you can manually configure this prefix. See Custom presets.
volumes:
  <volume-name>:
    <volume-field-name>: <volume-field-value>
Key Type Description
catalog_name String The name of the catalog of the schema and volume.
comment String The comment attached to the volume.
grants Sequence The grants associated with the volume. See grant.
name String The name of the volume.
schema_name String The name of the schema where the volume is.
storage_location String The storage ___location on the cloud.
volume_type String The volume type, either EXTERNAL or MANAGED. An external volume is located in the specified external ___location. A managed volume is located in the default ___location which is specified by the parent schema, or the parent catalog, or the metastore. See Managed versus external volumes.

Example

The following example creates a Unity Catalog volume with the key my_volume_id:

resources:
  volumes:
    my_volume_id:
      catalog_name: main
      name: my_volume
      schema_name: my_schema

For an example bundle that runs a job that writes to a file in a Unity Catalog volume, see the bundle-examples GitHub repository.
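
As noted above, a volume must exist in the workspace before it can be referenced in the bundle's artifact_path. The following is a minimal sketch of a follow-up configuration, assuming the volume from the previous example has already been deployed:

workspace:
  artifact_path: /Volumes/main/my_schema/my_volume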

Common objects

grant

Type: Sequence

Key Type Description
principal String The name of the principal that will be granted privileges.
privileges Sequence The privileges to grant to the specified entity.
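
For example, the following is a minimal sketch of a grants sequence on a schema, assuming an illustrative principal and privilege:

resources:
  schemas:
    my_schema:
      name: my_schema
      catalog_name: main
      grants:
        - principal: users
          privileges:
            - SELECT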