This page describes how to override or join top-level settings with target settings in Databricks Asset Bundles. For information about bundle settings, see Databricks Asset Bundle configuration.
Artifact settings override
You can override the artifact settings in a top-level artifacts mapping with the artifact settings in a targets mapping, for example:
# ...
artifacts:
  <some-unique-programmatic-identifier-for-this-artifact>:
    # Artifact settings.
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    artifacts:
      <the-matching-programmatic-identifier-for-this-artifact>:
        # Any more artifact settings to join with the settings from the
        # matching top-level artifacts mapping.
If any artifact setting is defined both in the top-level artifacts mapping and the targets mapping for the same artifact, then the setting in the targets mapping takes precedence over the setting in the top-level artifacts mapping.
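One way to picture this precedence rule is as a recursive merge of two mappings in which the target's values win. The Python sketch below is illustrative only. `merge_settings` is a hypothetical helper, not part of the Databricks CLI, and it assumes the settings are plain nested dictionaries.

```python
from copy import deepcopy


def merge_settings(top_level: dict, target: dict) -> dict:
    """Join top_level with target; target values win on conflict.

    Hypothetical helper for illustration only, not the Databricks CLI code.
    """
    merged = deepcopy(top_level)
    for key, value in target.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: join them setting by setting.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Otherwise the target setting takes precedence.
            merged[key] = deepcopy(value)
    return merged


top_level_artifacts = {"my-artifact": {"type": "whl", "path": "./my_package"}}
dev_artifacts = {"my-artifact": {"path": "./my_other_package"}}

resolved = merge_settings(top_level_artifacts, dev_artifacts)
print(resolved)
```

In this sketch, `type` is carried over from the top-level mapping because the target does not set it, while `path` is replaced by the target's value.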
Example 1: Artifact settings defined only in the top-level artifacts mapping
In the following example, path is defined only in the top-level artifacts mapping, which supplies all of the settings for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package
# ...
When you run databricks bundle validate for this example, the resulting graph is:
{
"...": "...",
"artifacts": {
"my-artifact": {
"type": "whl",
"path": "./my_package",
"...": "..."
}
},
"...": "..."
}
Example 2: Conflicting artifact settings defined in multiple artifact mappings
In this example, path is defined both in the top-level artifacts mapping and in the artifacts mapping in targets. The path in the artifacts mapping in targets takes precedence over the path in the top-level artifacts mapping and defines the setting for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package
targets:
  dev:
    artifacts:
      my-artifact:
        path: ./my_other_package
# ...
When you run databricks bundle validate for this example, the resulting graph is:
{
"...": "...",
"artifacts": {
"my-artifact": {
"type": "whl",
"path": "./my_other_package",
"...": "..."
}
},
"...": "..."
}
Cluster settings overrides
You can override or join the job or pipeline cluster settings for a target.
For jobs, use job_cluster_key within a job definition to identify job cluster settings in the top-level resources mapping to join with job cluster settings in a targets mapping:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      job_clusters:
        - job_cluster_key: <some-unique-programmatic-identifier-for-this-key>
          new_cluster:
            # Cluster settings.
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          job_clusters:
            - job_cluster_key: <the-matching-programmatic-identifier-for-this-key>
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level job_cluster_key.
          # ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same job_cluster_key, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
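The join on job_cluster_key can be sketched as pairing list entries by their key and then merging each pair with the target's values winning. The Python below is a minimal illustration under that assumption. `deep_merge` and `join_by_key` are hypothetical names, not Databricks CLI functions, and the CLI's actual implementation may differ.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge two mappings; values from override win."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def join_by_key(top_level: list, target: list, key: str) -> list:
    """Join list entries that share the same value for `key`.

    Hypothetical helper: entries are visited top-level first, so a
    target entry with the same key overrides the top-level one.
    """
    joined = {}
    for entry in top_level + target:
        ident = entry[key]
        if ident in joined:
            joined[ident] = deep_merge(joined[ident], entry)
        else:
            joined[ident] = deepcopy(entry)
    return list(joined.values())


top_level_clusters = [{
    "job_cluster_key": "my-cluster",
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 1,
    },
}]
target_clusters = [{
    "job_cluster_key": "my-cluster",
    "new_cluster": {"spark_version": "12.2.x-scala2.12", "num_workers": 2},
}]

resolved_clusters = join_by_key(top_level_clusters, target_clusters, "job_cluster_key")
print(resolved_clusters)
```

Passing "label" instead of "job_cluster_key" as the key would model the pipeline cluster case in the same way.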
For Lakeflow Declarative Pipelines, use label within the cluster settings of a pipeline definition to identify cluster settings in a top-level resources mapping to join with the cluster settings in a targets mapping, for example:
# ...
resources:
  pipelines:
    <some-unique-programmatic-identifier-for-this-pipeline>:
      # ...
      clusters:
        - label: default | maintenance
          # Cluster settings.
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      pipelines:
        <the-matching-programmatic-identifier-for-this-pipeline>:
          # ...
          clusters:
            - label: default | maintenance
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level label.
          # ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same label, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
Example 1: New job cluster settings defined in multiple resource mappings and with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                node_type_id: Standard_DS3_v2
                num_workers: 1
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"job_clusters": [
{
"job_cluster_key": "my-cluster",
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 1,
"spark_version": "13.3.x-scala2.12"
}
}
],
"...": "..."
}
}
}
}
Example 2: Conflicting new job cluster settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The values in the resources mapping in targets take precedence over those in the top-level resources mapping and define the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1
targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"job_clusters": [
{
"job_cluster_key": "my-cluster",
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 2,
"spark_version": "12.2.x-scala2.12"
}
}
],
"...": "..."
}
}
}
}
Example 3: Pipeline cluster settings defined in multiple resource mappings and with no settings conflicts
In this example, node_type_id in the top-level resources mapping is combined with num_workers in the resources mapping in targets to define the settings for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2
targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 1
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"pipelines": {
"my-pipeline": {
"clusters": [
{
"label": "default",
"node_type_id": "Standard_DS3_v2",
"num_workers": 1
}
],
"...": "..."
}
}
}
}
Example 4: Conflicting pipeline cluster settings defined in multiple resource mappings
In this example, num_workers is defined both in the top-level resources mapping and in the resources mapping in targets. The num_workers value in the resources mapping in targets takes precedence over num_workers in the top-level resources mapping and defines the setting for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2
          num_workers: 1
targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 2
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"pipelines": {
"my-pipeline": {
"clusters": [
{
"label": "default",
"node_type_id": "Standard_DS3_v2",
"num_workers": 2
}
],
"...": "..."
}
}
}
}
Job task settings override
You can use the tasks mapping within a job definition to join the job task settings in a top-level resources mapping with the job task settings in a targets mapping, for example:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      tasks:
        - task_key: <some-unique-programmatic-identifier-for-this-task>
          # Task settings.
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          tasks:
            - task_key: <the-matching-programmatic-identifier-for-this-key>
              # Any more task settings to join with the settings from the
              # resources mapping for the matching top-level task_key.
          # ...
To join the top-level resources mapping and the targets mapping for the same task, the task mappings' task_key must be set to the same value.
If any job task setting is defined both in the top-level resources mapping and the targets mapping for the same task, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
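Because the join depends on identical task_key values, a mistyped key in a target silently fails to match the intended task. The Python sketch below shows one way to spot that before deploying. `unmatched_task_keys` is a hypothetical helper, not a Databricks CLI feature, and it assumes the tasks have already been loaded as lists of dictionaries.

```python
def unmatched_task_keys(top_level_tasks: list, target_tasks: list) -> list:
    """Return target task_key values that match no top-level task_key.

    Hypothetical lint helper for illustration only.
    """
    known = {task["task_key"] for task in top_level_tasks}
    return [
        task["task_key"]
        for task in target_tasks
        if task["task_key"] not in known
    ]


# The top-level task uses "my-key", but the target refers to "my-task",
# so the two entries would not be joined.
top_level_tasks = [
    {"task_key": "my-key", "new_cluster": {"spark_version": "13.3.x-scala2.12"}},
]
target_tasks = [
    {
        "task_key": "my-task",
        "new_cluster": {"node_type_id": "Standard_DS3_v2", "num_workers": 1},
    },
]

print(unmatched_task_keys(top_level_tasks, target_tasks))
```

An empty result means every task in the target has a top-level task with the same task_key to join with.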
Example 1: Job task settings defined in multiple resource mappings and with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the task_key named my-task:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                node_type_id: Standard_DS3_v2
                num_workers: 1
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows (ellipses indicate omitted content, for brevity):
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"tasks": [
{
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 1,
"spark_version": "13.3.x-scala2.12"
},
"task_key": "my-task"
}
],
"...": "..."
}
}
}
}
Example 2: Conflicting job task settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The values in the resources mapping in targets take precedence over those in the top-level resources mapping and define the settings for the task_key named my-task (ellipses indicate omitted content, for brevity):
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1
targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
          # ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"tasks": [
{
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 2,
"spark_version": "12.2.x-scala2.12"
},
"task_key": "my-task"
}
],
"...": "..."
}
}
}
}