clusters command group

Note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The clusters command group within the Databricks CLI allows you to create, start, edit, list, terminate, and delete clusters.

A Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning. See Connect to all-purpose and jobs compute.

Important

Databricks retains cluster configuration information for terminated clusters for 30 days. To keep an all-purpose cluster configuration even after it has been terminated for more than 30 days, an administrator can pin a cluster to the cluster list.

databricks clusters change-owner

Change the owner of the cluster. You must be an admin and the cluster must be terminated to perform this operation. The service principal application ID can be supplied as an argument to owner_username.

databricks clusters change-owner CLUSTER_ID OWNER_USERNAME [flags]

Arguments

CLUSTER_ID

    The cluster ID.

OWNER_USERNAME

    The username (or service principal application ID) of the new owner of the cluster.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags
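
For example, the following sketch transfers ownership of a terminated cluster (the cluster ID and username are illustrative placeholders):

```shell
# Transfer ownership of a terminated cluster to another user.
# To transfer to a service principal, pass its application ID
# in place of the username.
databricks clusters change-owner 1234-567890-abcde123 new.owner@example.com
```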

databricks clusters create

Create a new cluster. This command acquires new instances from the cloud provider if necessary. This command is asynchronous; the returned cluster_id can be used to poll the cluster status. When this command returns, the cluster will be in a PENDING state. The cluster will be usable once it enters a RUNNING state. Databricks may not be able to acquire some of the requested nodes, due to cloud provider limitations (account limits, spot price, etc.) or transient network issues.

If Databricks acquires at least 85% of the requested on-demand nodes, cluster creation will succeed. Otherwise the cluster will terminate with an informative error message.

Rather than authoring the cluster's JSON definition from scratch, Databricks recommends filling out the create compute UI and then copying the generated JSON definition from the UI.

databricks clusters create SPARK_VERSION [flags]

Arguments

SPARK_VERSION

    The Spark version of the cluster, for example, 13.3.x-scala2.12. A list of available Spark versions can be retrieved by using the List available Spark versions API.

Options

--apply-policy-default-values

    When set to true, fixed and default values from the policy will be used for fields that are omitted.

--autotermination-minutes int

    Automatically terminates the cluster after it is inactive for this time in minutes.

--cluster-name string

    Cluster name requested by the user.

--data-security-mode DataSecurityMode

    Data security mode decides what data governance model to use when accessing data from a cluster. Supported values: DATA_SECURITY_MODE_AUTO, DATA_SECURITY_MODE_DEDICATED, DATA_SECURITY_MODE_STANDARD, LEGACY_PASSTHROUGH, LEGACY_SINGLE_USER, LEGACY_SINGLE_USER_STANDARD, LEGACY_TABLE_ACL, NONE, SINGLE_USER, USER_ISOLATION

--driver-instance-pool-id string

    The optional ID of the instance pool to which the cluster's driver belongs.

--driver-node-type-id string

    The node type of the Spark driver.

--enable-elastic-disk

    Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.

--enable-local-disk-encryption

    Whether to enable LUKS on cluster VMs' local disks.

--instance-pool-id string

    The optional ID of the instance pool to which the cluster belongs.

--is-single-node

    This field can only be used when kind = CLASSIC_PREVIEW.

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--kind Kind

    The kind of compute described by this compute specification. Supported values: CLASSIC_PREVIEW

--no-wait

    Do not wait to reach RUNNING state

--node-type-id string

    This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster.

--num-workers int

    Number of worker nodes that this cluster should have.

--policy-id string

    The ID of the cluster policy used to create the cluster if applicable.

--runtime-engine RuntimeEngine

    Determines the cluster's runtime engine, either standard or Photon. Supported values: NULL, PHOTON, STANDARD

--single-user-name string

    Single user name if data_security_mode is SINGLE_USER.

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

--use-ml-runtime

    This field can only be used when kind = CLASSIC_PREVIEW.

Global flags
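
As a sketch, the following creates a small fixed-size cluster using only the flags documented above. The cluster name, node type, and worker count are illustrative values; node type IDs are cloud-specific, so substitute one valid for your workspace:

```shell
# Create a 2-worker cluster on Databricks Runtime 13.3 LTS that
# auto-terminates after 60 idle minutes. The command returns a
# cluster_id that can be used to poll the cluster's status.
databricks clusters create 13.3.x-scala2.12 \
  --cluster-name example-etl-cluster \
  --node-type-id i3.xlarge \
  --num-workers 2 \
  --autotermination-minutes 60
```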

databricks clusters delete

Terminate the cluster with the specified ID. The cluster is removed asynchronously. Once the termination has completed, the cluster will be in a TERMINATED state. If the cluster is already in a TERMINATING or TERMINATED state, nothing will happen.

databricks clusters delete CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster to be terminated.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--no-wait

    Do not wait to reach TERMINATED state

--timeout duration

    The maximum amount of time to reach TERMINATED state (default 20m0s)

Global flags
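
For example (with an illustrative cluster ID):

```shell
# Terminate the cluster and wait for the TERMINATED state
# (default 20m timeout).
databricks clusters delete 1234-567890-abcde123

# Or return immediately without waiting for termination to complete.
databricks clusters delete 1234-567890-abcde123 --no-wait
```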

databricks clusters edit

Update the configuration of a cluster to match the provided attributes and size. A cluster can be updated if it is in a RUNNING or TERMINATED state.

If a cluster is updated while in a RUNNING state, it will be restarted so that the new attributes can take effect.

If a cluster is updated while in a TERMINATED state, it will remain TERMINATED. The next time it is started using the clusters/start API, the new attributes will take effect. Any attempt to update a cluster in any other state will be rejected with an INVALID_STATE error code.

Clusters created by the Databricks Jobs service cannot be edited.

databricks clusters edit CLUSTER_ID SPARK_VERSION [flags]

Arguments

CLUSTER_ID

    ID of the cluster

SPARK_VERSION

    The Spark version of the cluster, for example, 13.3.x-scala2.12. A list of available Spark versions can be retrieved by using the List available Spark versions API.

Options

--apply-policy-default-values

    Use fixed and default values from the policy for fields that are omitted.

--autotermination-minutes int

    Automatically terminate the cluster after it is inactive for this time in minutes.

--cluster-name string

    Cluster name requested by the user.

--data-security-mode DataSecurityMode

    Data security mode decides what data governance model to use when accessing data from a cluster. Supported values: DATA_SECURITY_MODE_AUTO, DATA_SECURITY_MODE_DEDICATED, DATA_SECURITY_MODE_STANDARD, LEGACY_PASSTHROUGH, LEGACY_SINGLE_USER, LEGACY_SINGLE_USER_STANDARD, LEGACY_TABLE_ACL, NONE, SINGLE_USER, USER_ISOLATION

--driver-instance-pool-id string

    The optional ID of the instance pool to which the cluster's driver belongs.

--driver-node-type-id string

    The node type of the Spark driver.

--enable-elastic-disk

    Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.

--enable-local-disk-encryption

    Whether to enable LUKS on cluster VMs' local disks.

--instance-pool-id string

    The optional ID of the instance pool to which the cluster belongs.

--is-single-node

    This field can only be used when kind = CLASSIC_PREVIEW.

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--kind Kind

    The kind of compute described by this compute specification. Supported values: CLASSIC_PREVIEW

--no-wait

    Do not wait to reach RUNNING state

--node-type-id string

    This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster.

--num-workers int

    Number of worker nodes that this cluster should have.

--policy-id string

    The ID of the cluster policy used to create the cluster if applicable.

--runtime-engine RuntimeEngine

    Determines the cluster's runtime engine, either standard or Photon. Supported values: NULL, PHOTON, STANDARD

--single-user-name string

    Single user name if data_security_mode is SINGLE_USER.

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

--use-ml-runtime

    This field can only be used when kind = CLASSIC_PREVIEW.

Global flags

databricks clusters events

List events about the activity of a cluster. This API is paginated. If there are more events to read, the response includes all the parameters necessary to request the next page of events.

databricks clusters events CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The ID of the cluster to retrieve events about.

Options

--end-time int

    The end time in epoch milliseconds.

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--limit int

    Deprecated: use page_token in combination with page_size instead.

--offset int

    Deprecated: use page_token in combination with page_size instead.

--order GetEventsOrder

    The order to list events in. Supported values: ASC, DESC

--page-size int

    The maximum number of events to include in a page of events.

--page-token string

    Use next_page_token or prev_page_token returned from the previous request to list the next or previous page of events respectively.

--start-time int

    The start time in epoch milliseconds.

Global flags
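
For example, paginating through a cluster's events might look like the following (the cluster ID is illustrative, and the page token placeholder comes from the previous response):

```shell
# Fetch the 50 most recent events for a cluster, newest first.
databricks clusters events 1234-567890-abcde123 --order DESC --page-size 50

# Fetch the next page by passing the next_page_token value
# returned in the previous response.
databricks clusters events 1234-567890-abcde123 --order DESC --page-size 50 \
  --page-token "<next_page_token from previous response>"
```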

databricks clusters get

Gets the information for a cluster given its identifier. Clusters can be described while they are running, or up to 60 days after they are terminated.

databricks clusters get CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster about which to retrieve information.

Options

Global flags

databricks clusters list

List information about all pinned and active clusters, and all clusters terminated within the last 30 days. Clusters terminated prior to this period are not included.

databricks clusters list [flags]

Arguments

None

Options

--cluster-sources []string

    Filter clusters by source

--cluster-states []string

    Filter clusters by states

--is-pinned

    Filter clusters by pinned status

--page-size int

    Use this field to specify the maximum number of results to be returned by the server.

--page-token string

    Use next_page_token or prev_page_token returned from the previous request to list the next or previous page of clusters respectively.

--policy-id string

    Filter clusters by policy id

Global flags
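
For example (RUNNING is assumed here as a cluster state value; JSON output is useful for scripting):

```shell
# List only running clusters, as JSON for scripting.
databricks clusters list --cluster-states RUNNING -o json

# List only pinned clusters.
databricks clusters list --is-pinned
```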

databricks clusters list-node-types

List supported Spark node types. These node types can be used to launch a cluster.

databricks clusters list-node-types [flags]

Arguments

None

Options

Global flags

databricks clusters list-zones

List the availability zones where clusters can be created (for example, us-west-2a). These zones can be used to launch a cluster.

databricks clusters list-zones [flags]

Arguments

None

Options

Global flags

databricks clusters permanent-delete

Permanently delete a cluster. The cluster is terminated and its resources are asynchronously removed.

In addition, users will no longer see permanently deleted clusters in the cluster list, and API users can no longer perform any action on permanently deleted clusters.

databricks clusters permanent-delete CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster to be deleted.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

databricks clusters pin

Pin a cluster to ensure that the cluster will always be returned by the ListClusters API. Pinning a cluster that is already pinned will have no effect. This API can only be called by workspace admins.

databricks clusters pin CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster ID.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

databricks clusters resize

Resize a cluster to have the desired number of workers. This will fail unless the cluster is in a RUNNING state.

databricks clusters resize CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster to be resized.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--no-wait

    Do not wait to reach RUNNING state

--num-workers int

    Number of worker nodes that this cluster should have.

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

Global flags

databricks clusters restart

Restart a cluster with the specified ID. If the cluster is not currently in a RUNNING state, nothing will happen.

databricks clusters restart CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster to be restarted.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--no-wait

    Do not wait to reach RUNNING state

--restart-user string

    User who restarted the cluster.

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

Global flags

databricks clusters spark-versions

List the available Spark versions. These versions can be used to launch a cluster.

databricks clusters spark-versions [flags]

Arguments

None

Options

Global flags

databricks clusters start

Start a terminated cluster with the specified ID. This works similarly to createCluster, except:

- The previous cluster ID and attributes are preserved.
- The cluster starts with the last specified cluster size.
- If the previous cluster was an autoscaling cluster, the current cluster starts with the minimum number of nodes.
- If the cluster is not currently in a TERMINATED state, nothing will happen.
- Clusters launched to run a job cannot be started.

databricks clusters start CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster to be started.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--no-wait

    Do not wait to reach RUNNING state

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

Global flags

databricks clusters unpin

Unpin a cluster to allow the cluster to eventually be removed from the ListClusters API. Unpinning a cluster that is not pinned will have no effect. This API can only be called by workspace admins.

databricks clusters unpin CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster ID.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

databricks clusters update

Update the configuration of a cluster to match the partial set of attributes and size. Denote which fields to update using the update_mask field in the request body. A cluster can be updated if it is in a RUNNING or TERMINATED state.

If a cluster is updated while in a RUNNING state, it will be restarted so that the new attributes can take effect.

If a cluster is updated while in a TERMINATED state, it will remain TERMINATED. The updated attributes will take effect the next time the cluster is started using the clusters start API. Attempts to update a cluster in any other state will be rejected with an INVALID_STATE error code.

Clusters created by the Databricks Jobs service cannot be updated.

databricks clusters update CLUSTER_ID UPDATE_MASK [flags]

Arguments

CLUSTER_ID

    ID of the cluster.

UPDATE_MASK

    Used to specify which cluster attributes and size fields to update. See https://google.aip.dev/161 for more details. The field mask must be a single string, with multiple fields separated by commas (no spaces). The field path is relative to the resource object, using a dot (.) to navigate sub-fields (for example, author.given_name). Specification of elements in sequence or map fields is not allowed, as only the entire collection field can be specified. Field names must exactly match the resource field names. A field mask of * indicates full replacement. It's recommended to always explicitly list the fields being updated and avoid using * wildcards, as this can lead to unintended results if the API changes in the future.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

--no-wait

    Do not wait to reach RUNNING state

--timeout duration

    The maximum amount of time to reach RUNNING state (default 20m0s)

Global flags
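
As a sketch, updating a single attribute might look like the following. The field mask names the attribute being changed; the JSON body shape shown (a cluster object carrying the new value) is an assumption to verify against the Clusters API reference for your CLI version:

```shell
# Update only autotermination_minutes, leaving other attributes untouched.
# The cluster ID is illustrative, and the request body shape is an
# assumption; verify it against the Clusters API reference.
databricks clusters update 1234-567890-abcde123 autotermination_minutes \
  --json '{"cluster": {"autotermination_minutes": 30}}'
```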

databricks clusters get-permission-levels

Get cluster permission levels.

databricks clusters get-permission-levels CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster for which to get or manage permissions.

Options

Global flags

databricks clusters get-permissions

Get cluster permissions. Clusters can inherit permissions from their root object.

databricks clusters get-permissions CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster for which to get or manage permissions.

Options

Global flags

databricks clusters set-permissions

Set cluster permissions, replacing existing permissions if they exist. Deletes all direct permissions if none are specified. Objects can inherit permissions from their root object.

databricks clusters set-permissions CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster for which to get or manage permissions.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags
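
A sketch of replacing a cluster's direct permissions (the cluster ID and group name are illustrative; CAN_RESTART is assumed here as one of the cluster permission levels, which you can confirm with get-permission-levels):

```shell
# Grant the data-engineers group restart rights, replacing any
# existing direct permissions on the cluster.
databricks clusters set-permissions 1234-567890-abcde123 --json '{
  "access_control_list": [
    {"group_name": "data-engineers", "permission_level": "CAN_RESTART"}
  ]
}'
```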

databricks clusters update-permissions

Update the permissions on a cluster. Clusters can inherit permissions from their root object.

databricks clusters update-permissions CLUSTER_ID [flags]

Arguments

CLUSTER_ID

    The cluster for which to get or manage permissions.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body

Global flags

Global flags

--debug

    Whether to enable debug logging.

-h or --help

    Display help for the Databricks CLI, the related command group, or the related command.

--log-file string

    A string representing the file to write output logs to. If this flag is not specified then the default is to write output logs to stderr.

--log-format format

    The log format type, text or json. The default value is text.

--log-level string

    A string representing the log level. If this flag is not specified then logging is disabled.

-o, --output type

    The command output type, text or json. The default value is text.

-p, --profile string

    The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified then if it exists, the profile named DEFAULT is used.

--progress-format format

    The format to display progress logs: default, append, inplace, or json

-t, --target string

    If applicable, the bundle target to use