This page describes how to override or join top-level settings with target settings in Databricks Asset Bundles. For information about bundle settings, see Databricks Asset Bundle configuration.
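Every override described on this page follows the same pattern: settings declared once in a top-level mapping of a bundle configuration file are joined with, and where they conflict overridden by, settings declared under a named target. The following skeleton is a minimal sketch of that file layout, not a complete configuration; the bundle name and target name are hypothetical:

# A minimal sketch of a bundle configuration file (databricks.yml).
# The bundle name and target name below are hypothetical.
bundle:
  name: my-bundle

artifacts:
  # Top-level artifact settings (see "Artifact settings override").

resources:
  # Top-level job and pipeline settings (see the cluster and task sections).

targets:
  development:
    # Target settings that join with or override the top-level mappings.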
Artifact settings override
You can override the artifact settings in a top-level artifacts mapping with the artifact settings in a targets mapping, for example:
# ...
artifacts:
  <some-unique-programmatic-identifier-for-this-artifact>:
    # Artifact settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    artifacts:
      <the-matching-programmatic-identifier-for-this-artifact>:
        # Any more artifact settings to join with the settings from the
        # matching top-level artifacts mapping.
If any artifact setting is defined both in the top-level artifacts mapping and the targets mapping for the same artifact, then the setting in the targets mapping takes precedence over the setting in the top-level artifacts mapping.
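Settings that do not conflict are joined rather than replaced. As a hedged sketch (the artifact name, paths, and build command below are hypothetical), a target can add a build setting to an artifact whose type and path come from the top-level mapping:

# A hedged sketch: the dev target adds a build command to the artifact;
# the top-level type and path settings carry over unchanged.
# The artifact name, paths, and build command are hypothetical.
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

targets:
  dev:
    artifacts:
      my-artifact:
        build: poetry build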
Example 1: Artifact settings defined only in the top-level artifacts mapping
To demonstrate how this works in practice, in the following example, path is defined in the top-level artifacts mapping, which defines all of the settings for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package
# ...
When you run databricks bundle validate for this example, the resulting graph is:
{
"...": "...",
"artifacts": {
"my-artifact": {
"type": "whl",
"path": "./my_package",
"...": "..."
}
},
"...": "..."
}
Example 2: Conflicting artifact settings defined in multiple artifact mappings
In this example, path is defined both in the top-level artifacts mapping and in the artifacts mapping in targets. The path in the artifacts mapping in targets takes precedence over the path in the top-level artifacts mapping to define the settings for the artifact:
# ...
artifacts:
  my-artifact:
    type: whl
    path: ./my_package

targets:
  dev:
    artifacts:
      my-artifact:
        path: ./my_other_package
# ...
When you run databricks bundle validate for this example, the resulting graph is:
{
"...": "...",
"artifacts": {
"my-artifact": {
"type": "whl",
"path": "./my_other_package",
"...": "..."
}
},
"...": "..."
}
Cluster settings override
You can override or join the job or pipeline cluster settings for a target.
For jobs, use job_cluster_key within a job definition to identify job cluster settings in the top-level resources mapping to join with job cluster settings in a targets mapping:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      job_clusters:
        - job_cluster_key: <some-unique-programmatic-identifier-for-this-key>
          new_cluster:
            # Cluster settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          job_clusters:
            - job_cluster_key: <the-matching-programmatic-identifier-for-this-key>
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level job_cluster_key.
# ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same job_cluster_key, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
For Lakeflow Declarative Pipelines, use label within the cluster settings of a pipeline definition to identify cluster settings in a top-level resources mapping to join with the cluster settings in a targets mapping, for example:
# ...
resources:
  pipelines:
    <some-unique-programmatic-identifier-for-this-pipeline>:
      # ...
      clusters:
        - label: default | maintenance
          # Cluster settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      pipelines:
        <the-matching-programmatic-identifier-for-this-pipeline>:
          # ...
          clusters:
            - label: default | maintenance
              # Any more cluster settings to join with the settings from the
              # resources mapping for the matching top-level label.
# ...
If any cluster setting is defined both in the top-level resources mapping and the targets mapping for the same label, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
Example 1: New job cluster settings defined in multiple resource mappings with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                node_type_id: Standard_DS3_v2
                num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"job_clusters": [
{
"job_cluster_key": "my-cluster",
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 1,
"spark_version": "13.3.x-scala2.12"
}
}
],
"...": "..."
}
}
}
}
Example 2: Conflicting new job cluster settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The spark_version and num_workers in the resources mapping in targets take precedence over the spark_version and num_workers in the top-level resources mapping to define the settings for the job_cluster_key named my-cluster:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      job_clusters:
        - job_cluster_key: my-cluster
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          job_clusters:
            - job_cluster_key: my-cluster
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"job_clusters": [
{
"job_cluster_key": "my-cluster",
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 2,
"spark_version": "12.2.x-scala2.12"
}
}
],
"...": "..."
}
}
}
}
Example 3: Pipeline cluster settings defined in multiple resource mappings with no settings conflicts
In this example, node_type_id in the top-level resources mapping is combined with num_workers in the resources mapping in targets to define the settings for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2

targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"pipelines": {
"my-pipeline": {
"clusters": [
{
"label": "default",
"node_type_id": "Standard_DS3_v2",
"num_workers": 1
}
],
"...": "..."
}
}
}
}
Example 4: Conflicting pipeline cluster settings defined in multiple resource mappings
In this example, num_workers is defined both in the top-level resources mapping and in the resources mapping in targets. The num_workers in the resources mapping in targets takes precedence over the num_workers in the top-level resources mapping to define the settings for the label named default:
# ...
resources:
  pipelines:
    my-pipeline:
      clusters:
        - label: default
          node_type_id: Standard_DS3_v2
          num_workers: 1

targets:
  development:
    resources:
      pipelines:
        my-pipeline:
          clusters:
            - label: default
              num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"pipelines": {
"my-pipeline": {
"clusters": [
{
"label": "default",
"node_type_id": "Standard_DS3_v2",
"num_workers": 2
}
],
"...": "..."
}
}
}
}
Job task settings override
You can use the tasks mapping within a job definition to join the job task settings in a top-level resources mapping with the job task settings in a targets mapping, for example:
# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      tasks:
        - task_key: <some-unique-programmatic-identifier-for-this-task>
          # Task settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          tasks:
            - task_key: <the-matching-programmatic-identifier-for-this-task>
              # Any more task settings to join with the settings from the
              # resources mapping for the matching top-level task_key.
# ...
To join the top-level resources mapping and the targets mapping for the same task, the task_key in each task mapping must be set to the same value.
If any job task setting is defined both in the top-level resources mapping and the targets mapping for the same task, then the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.
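Conversely, if the task_key values do not match, the settings are not joined. In the following hedged sketch (the job and task keys are hypothetical), the target's task is treated as its own entry in the job's tasks list rather than as an override of the top-level task:

# A hedged sketch: the task_key values below differ, so the target's
# settings do not override the top-level task.
# The job and task keys are hypothetical.
resources:
  jobs:
    my-job:
      tasks:
        - task_key: extract
          # Top-level task settings.

targets:
  development:
    resources:
      jobs:
        my-job:
          tasks:
            - task_key: transform
              # These settings are not joined with the extract task,
              # because the task_key values do not match.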
Example 1: Job task settings defined in multiple resource mappings with no settings conflicts
In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the task_key named my-task:
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                node_type_id: Standard_DS3_v2
                num_workers: 1
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows (ellipses indicate content omitted for brevity):
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"tasks": [
{
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 1,
"spark_version": "13.3.x-scala2.12"
},
"task-key": "my-task"
}
],
"...": "..."
}
}
}
}
Example 2: Conflicting job task settings defined in multiple resource mappings
In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The spark_version and num_workers in the resources mapping in targets take precedence over the spark_version and num_workers in the top-level resources mapping. This defines the settings for the task_key named my-task (ellipses indicate content omitted for brevity):
# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
# ...
When you run databricks bundle validate for this example, the resulting graph is as follows:
{
"...": "...",
"resources": {
"jobs": {
"my-job": {
"tasks": [
{
"new_cluster": {
"node_type_id": "Standard_DS3_v2",
"num_workers": 2,
"spark_version": "12.2.x-scala2.12"
},
"task_key": "my-task"
}
],
"...": "..."
}
}
}
}
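To inspect the fully merged settings for a target yourself, you can request machine-readable output from the Databricks CLI, for example by running databricks bundle validate -t development -o json. This command is a sketch; it assumes a target named development and a CLI version that supports the global output flag.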