Note
This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.
Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.
The clusters command group within the Databricks CLI allows you to create, start, edit, list, terminate, and delete clusters.
A Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning. See Connect to all-purpose and jobs compute.
Important
Databricks retains cluster configuration information for terminated clusters for 30 days. To keep an all-purpose cluster configuration even after it has been terminated for more than 30 days, an administrator can pin a cluster to the cluster list.
databricks clusters change-owner
Change the owner of the cluster. You must be an admin and the cluster must be terminated to perform this operation. The service principal application ID can be supplied as an argument to owner_username.
databricks clusters change-owner CLUSTER_ID OWNER_USERNAME [flags]
Arguments
CLUSTER_ID
The cluster ID.
OWNER_USERNAME
The username of the new owner of the cluster. A service principal application ID can also be supplied.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
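For example, to transfer ownership of a terminated cluster (the cluster ID and username below are placeholders; substitute values from your workspace):

```shell
# The cluster must be terminated and you must be a workspace admin
databricks clusters change-owner 1234-567890-abcde123 someone@example.com
```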
databricks clusters create
Create a new cluster. This command acquires new instances from the cloud provider if necessary. This command is asynchronous; the returned cluster_id can be used to poll the cluster status. When this command returns, the cluster will be in a PENDING state. The cluster will be usable once it enters a RUNNING state. Databricks may not be able to acquire some of the requested nodes, due to cloud provider limitations (account limits, spot price, etc.) or transient network issues.
If Databricks acquires at least 85% of the requested on-demand nodes, cluster creation will succeed. Otherwise the cluster will terminate with an informative error message.
Rather than authoring the cluster's JSON definition from scratch, Databricks recommends filling out the create compute UI and then copying the generated JSON definition from the UI.
databricks clusters create SPARK_VERSION [flags]
Arguments
SPARK_VERSION
The Spark version of the cluster, for example, 13.3.x-scala2.12. A list of available Spark versions can be retrieved by using the List available Spark versions API.
Options
--apply-policy-default-values
When set to true, fixed and default values from the policy will be used for fields that are omitted.
--autotermination-minutes int
Automatically terminates the cluster after it is inactive for this time in minutes.
--cluster-name string
Cluster name requested by the user.
--data-security-mode DataSecurityMode
Data security mode decides what data governance model to use when accessing data from a cluster. Supported values: DATA_SECURITY_MODE_AUTO, DATA_SECURITY_MODE_DEDICATED, DATA_SECURITY_MODE_STANDARD, LEGACY_PASSTHROUGH, LEGACY_SINGLE_USER, LEGACY_SINGLE_USER_STANDARD, LEGACY_TABLE_ACL, NONE, SINGLE_USER, USER_ISOLATION
--driver-instance-pool-id string
The optional ID of the instance pool to which the cluster's driver belongs.
--driver-node-type-id string
The node type of the Spark driver.
--enable-elastic-disk
Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.
--enable-local-disk-encryption
Whether to enable LUKS on cluster VMs' local disks.
--instance-pool-id string
The optional ID of the instance pool to which the cluster belongs.
--is-single-node
This field can only be used when kind = CLASSIC_PREVIEW.
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--kind Kind
The kind of compute described by this compute specification. Supported values: CLASSIC_PREVIEW
--no-wait
Do not wait to reach RUNNING state
--node-type-id string
This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster.
--num-workers int
Number of worker nodes that this cluster should have.
--policy-id string
The ID of the cluster policy used to create the cluster if applicable.
--runtime-engine RuntimeEngine
Determines the cluster's runtime engine, either standard or Photon. Supported values: NULL, PHOTON, STANDARD
--single-user-name string
Single user name if data_security_mode is SINGLE_USER.
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
--use-ml-runtime
This field can only be used when kind = CLASSIC_PREVIEW.
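A minimal create call might supply the request body as a JSON file. The cluster name, node type, and other values below are placeholders; substitute values valid in your workspace and cloud provider:

```shell
# Hypothetical request body; substitute values valid in your workspace
cat > create-cluster.json <<'EOF'
{
  "cluster_name": "etl-cluster",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "num_workers": 2,
  "autotermination_minutes": 60
}
EOF

# Create the cluster and wait (up to the default 20m) for RUNNING state
databricks clusters create --json @create-cluster.json
```

As recommended above, the JSON generated by the create compute UI can be pasted into the file instead of authoring it by hand.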
databricks clusters delete
Terminate the cluster with the specified ID. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state. If the cluster is already in a TERMINATING or TERMINATED state, nothing will happen.
databricks clusters delete CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster to be terminated.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--no-wait
Do not wait to reach TERMINATED state
--timeout duration
The maximum amount of time to reach TERMINATED state (default 20m0s)
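For example (the cluster ID is a placeholder):

```shell
# Terminate the cluster and wait for it to reach TERMINATED state
databricks clusters delete 1234-567890-abcde123

# Or return immediately without waiting
databricks clusters delete 1234-567890-abcde123 --no-wait
```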
databricks clusters edit
Update the configuration of a cluster to match the provided attributes and size. A cluster can be updated if it is in a RUNNING or TERMINATED state.
If a cluster is updated while in a RUNNING state, it will be restarted so that the new attributes can take effect.
If a cluster is updated while in a TERMINATED state, it will remain TERMINATED. The next time it is started using the clusters/start API, the new attributes will take effect. Any attempt to update a cluster in any other state will be rejected with an INVALID_STATE error code.
Clusters created by the Databricks Jobs service cannot be edited.
databricks clusters edit CLUSTER_ID SPARK_VERSION [flags]
Arguments
CLUSTER_ID
ID of the cluster
SPARK_VERSION
The Spark version of the cluster, for example, 13.3.x-scala2.12. A list of available Spark versions can be retrieved by using the List available Spark versions API.
Options
--apply-policy-default-values
Use fixed and default values from the policy for fields that are omitted.
--autotermination-minutes int
Automatically terminate the cluster after it is inactive for this time in minutes.
--cluster-name string
Cluster name requested by the user.
--data-security-mode DataSecurityMode
Data security mode decides what data governance model to use when accessing data from a cluster. Supported values: DATA_SECURITY_MODE_AUTO, DATA_SECURITY_MODE_DEDICATED, DATA_SECURITY_MODE_STANDARD, LEGACY_PASSTHROUGH, LEGACY_SINGLE_USER, LEGACY_SINGLE_USER_STANDARD, LEGACY_TABLE_ACL, NONE, SINGLE_USER, USER_ISOLATION
--driver-instance-pool-id string
The optional ID of the instance pool to which the cluster's driver belongs.
--driver-node-type-id string
The node type of the Spark driver.
--enable-elastic-disk
Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.
--enable-local-disk-encryption
Whether to enable LUKS on cluster VMs' local disks.
--instance-pool-id string
The optional ID of the instance pool to which the cluster belongs.
--is-single-node
This field can only be used when kind = CLASSIC_PREVIEW.
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--kind Kind
The kind of compute described by this compute specification. Supported values: CLASSIC_PREVIEW
--no-wait
Do not wait to reach RUNNING state
--node-type-id string
This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster.
--num-workers int
Number of worker nodes that this cluster should have.
--policy-id string
The ID of the cluster policy used to create the cluster if applicable.
--runtime-engine RuntimeEngine
Determines the cluster's runtime engine, either standard or Photon. Supported values: NULL, PHOTON, STANDARD
--single-user-name string
Single user name if data_security_mode is SINGLE_USER.
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
--use-ml-runtime
This field can only be used when kind = CLASSIC_PREVIEW.
databricks clusters events
List events about the activity of a cluster. This API is paginated. If there are more events to read, the response includes all the parameters necessary to request the next page of events.
databricks clusters events CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The ID of the cluster to retrieve events about.
Options
--end-time int
The end time in epoch milliseconds.
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--limit int
Deprecated: use page_token in combination with page_size instead.
--offset int
Deprecated: use page_token in combination with page_size instead.
--order GetEventsOrder
The order to list events in. Supported values: ASC, DESC
--page-size int
The maximum number of events to include in a page of events.
--page-token string
Use next_page_token or prev_page_token returned from the previous request to list the next or previous page of events respectively.
--start-time int
The start time in epoch milliseconds.
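For example, to page through recent activity (the cluster ID is a placeholder):

```shell
# Show the ten most recent events for the cluster, newest first
databricks clusters events 1234-567890-abcde123 --order DESC --page-size 10
```

If more events exist, pass the returned next_page_token back via --page-token to fetch the next page.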
databricks clusters get
Gets the information for a cluster given its identifier. Clusters can be described while they are running, or up to 60 days after they are terminated.
databricks clusters get CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster about which to retrieve information.
Options
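For example (the cluster ID is a placeholder):

```shell
# Print the cluster's full configuration and state as JSON
databricks clusters get 1234-567890-abcde123 -o json
```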
databricks clusters list
List information about all pinned and active clusters, and all clusters terminated within the last 30 days. Clusters terminated prior to this period are not included.
databricks clusters list [flags]
Arguments
None
Options
--cluster-sources []string
Filter clusters by source
--cluster-states []string
Filter clusters by states
--is-pinned
Filter clusters by pinned status
--page-size int
Use this field to specify the maximum number of results to be returned by the server.
--page-token string
Use next_page_token or prev_page_token returned from the previous request to list the next or previous page of clusters respectively.
--policy-id string
Filter clusters by policy id
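For example, assuming RUNNING is one of the state values you want to filter on:

```shell
# List only running clusters, as JSON
databricks clusters list --cluster-states RUNNING -o json
```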
databricks clusters list-node-types
List supported Spark node types. These node types can be used to launch a cluster.
databricks clusters list-node-types [flags]
Arguments
None
Options
databricks clusters list-zones
List the availability zones in which clusters can be created (for example, us-west-2a). These zones can be used to launch a cluster.
databricks clusters list-zones [flags]
Arguments
None
Options
databricks clusters permanent-delete
Permanently delete a cluster. The cluster is terminated and its resources are asynchronously removed.
In addition, users will no longer see permanently deleted clusters in the cluster list, and API users can no longer perform any action on permanently deleted clusters.
databricks clusters permanent-delete CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster to be deleted.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
databricks clusters pin
Pin a cluster to ensure that the cluster will always be returned by the ListClusters API. Pinning a cluster that is already pinned will have no effect. This API can only be called by workspace admins.
databricks clusters pin CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster ID.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
databricks clusters resize
Resize a cluster to have the desired number of workers. This will fail unless the cluster is in a RUNNING state.
databricks clusters resize CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster to be resized.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--no-wait
Do not wait to reach RUNNING state
--num-workers int
Number of worker nodes that this cluster should have.
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
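For example (the cluster ID and worker count are placeholders):

```shell
# Grow the cluster to 8 workers; the command waits for RUNNING by default
databricks clusters resize 1234-567890-abcde123 --num-workers 8
```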
databricks clusters restart
Restart a cluster with the specified ID. If the cluster is not currently in a RUNNING state, nothing will happen.
databricks clusters restart CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster to be restarted.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--no-wait
Do not wait to reach RUNNING state
--restart-user string
User who restarted the cluster.
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
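For example (the cluster ID is a placeholder):

```shell
# Restart the cluster and wait up to 30 minutes for it to reach RUNNING
databricks clusters restart 1234-567890-abcde123 --timeout 30m
```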
databricks clusters spark-versions
List the available Spark versions. These versions can be used to launch a cluster.
databricks clusters spark-versions [flags]
Arguments
None
Options
databricks clusters start
Start a terminated cluster with the specified ID. This works similar to createCluster except:
- The previous cluster ID and attributes are preserved.
- The cluster starts with the last specified cluster size.
- If the previous cluster was an autoscaling cluster, the current cluster starts with the minimum number of nodes.
- If the cluster is not currently in a TERMINATED state, nothing will happen.
- Clusters launched to run a job cannot be started.
databricks clusters start CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster to be started.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--no-wait
Do not wait to reach RUNNING state
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
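For example (the cluster ID is a placeholder):

```shell
# Start a terminated cluster without waiting for it to reach RUNNING state
databricks clusters start 1234-567890-abcde123 --no-wait
```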
databricks clusters unpin
Unpin a cluster to allow the cluster to eventually be removed from the ListClusters API. Unpinning a cluster that is not pinned will have no effect. This API can only be called by workspace admins.
databricks clusters unpin CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster ID.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
databricks clusters update
Update the configuration of a cluster to match the partial set of attributes and size. Denote which fields to update using the update_mask field in the request body. A cluster can be updated if it is in a RUNNING or TERMINATED state. If a cluster is updated while in a RUNNING state, it will be restarted so that the new attributes can take effect. If a cluster is updated while in a TERMINATED state, it will remain TERMINATED. The updated attributes will take effect the next time the cluster is started using the clusters start API. Attempts to update a cluster in any other state will be rejected with an INVALID_STATE error code. Clusters created by the Databricks Jobs service cannot be updated.
databricks clusters update CLUSTER_ID UPDATE_MASK [flags]
Arguments
CLUSTER_ID
ID of the cluster.
UPDATE_MASK
Used to specify which cluster attributes and size fields to update. See https://google.aip.dev/161 for more details. The field mask must be a single string, with multiple fields separated by commas (no spaces). The field path is relative to the resource object, using a dot (.) to navigate sub-fields (for example, author.given_name). Specification of elements in sequence or map fields is not allowed, as only the entire collection field can be specified. Field names must exactly match the resource field names. A field mask of * indicates full replacement. It's recommended to always explicitly list the fields being updated and avoid using * wildcards, as this can lead to unintended results if the API changes in the future.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
--no-wait
Do not wait to reach RUNNING state
--timeout duration
The maximum amount of time to reach RUNNING state (default 20m0s)
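A sketch of a partial update, assuming the changed fields are nested under a cluster object in the request body as in the Clusters API UpdateCluster payload (the cluster ID is a placeholder):

```shell
# Change only the autotermination setting; the update mask names the
# field being updated. The body shape shown here is an assumption.
databricks clusters update 1234-567890-abcde123 autotermination_minutes \
  --json '{"cluster": {"autotermination_minutes": 30}}'
```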
databricks clusters get-permission-levels
Get cluster permission levels.
databricks clusters get-permission-levels CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster for which to get or manage permissions.
Options
databricks clusters get-permissions
Get cluster permissions. Clusters can inherit permissions from their root object.
databricks clusters get-permissions CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster for which to get or manage permissions.
Options
databricks clusters set-permissions
Set cluster permissions, replacing existing permissions if they exist. Deletes all direct permissions if none are specified. Objects can inherit permissions from their root object.
databricks clusters set-permissions CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster for which to get or manage permissions.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
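For example, to replace a cluster's direct permissions with a single grant (the cluster ID and username are placeholders; CAN_RESTART is one of the cluster permission levels, alongside CAN_ATTACH_TO and CAN_MANAGE):

```shell
# Replace all direct permissions on the cluster with one grant
databricks clusters set-permissions 1234-567890-abcde123 --json '{
  "access_control_list": [
    {"user_name": "someone@example.com", "permission_level": "CAN_RESTART"}
  ]
}'
```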
databricks clusters update-permissions
Update the permissions on a cluster. Clusters can inherit permissions from their root object.
databricks clusters update-permissions CLUSTER_ID [flags]
Arguments
CLUSTER_ID
The cluster for which to get or manage permissions.
Options
--json JSON
The inline JSON string or the @path to the JSON file with the request body
Global flags
--debug
Whether to enable debug logging.
-h
or --help
Display help for the Databricks CLI or the related command group or the related command.
--log-file string
A string representing the file to write output logs to. If this flag is not specified then the default is to write output logs to stderr.
--log-format format
The log format type, text or json. The default value is text.
--log-level string
A string representing the log level. If this flag is not specified then logging is disabled.
-o, --output type
The command output type, text or json. The default value is text.
-p, --profile string
The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified, the profile named DEFAULT is used if it exists.
--progress-format format
The format to display progress logs: default, append, inplace, or json
-t, --target
string
If applicable, the bundle target to use