
Cluster

laktory.models.resources.databricks.Cluster ¤

Bases: ClusterBase

Databricks cluster

Examples:

import io

from laktory import models

cluster_yaml = '''
name: default
spark_version: 16.3.x-scala2.12
data_security_mode: USER_ISOLATION
node_type_id: Standard_DS3_v2
autoscale:
  min_workers: 1
  max_workers: 4
autotermination_minutes: 30
libraries:
- pypi:
    package: laktory==0.0.23
access_controls:
- group_name: role-engineers
  permission_level: CAN_RESTART
is_pinned: true
'''
cluster = models.resources.databricks.Cluster.model_validate_yaml(
    io.StringIO(cluster_yaml)
)
BASE DESCRIPTION
apply_policy_default_values

When set to true, fixed and default values from the policy will be used for fields that are omitted. When set to false, only fixed values from the policy will be applied.

TYPE: bool | None | VariableType DEFAULT: None

autoscale

Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later.

TYPE: ClusterAutoscale | None | VariableType DEFAULT: None

autotermination_minutes

Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination.

TYPE: int | None | VariableType DEFAULT: None
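
The rule above can be expressed as a small check. This is a minimal sketch, not part of laktory: the helper name `is_valid_autotermination` is hypothetical, and treating the 10 and 10000 bounds as inclusive is an assumption.

```python
from typing import Optional


def is_valid_autotermination(minutes: Optional[int]) -> bool:
    """Hypothetical helper: check a value against the documented rule."""
    if minutes is None:
        # Not set: the cluster is not automatically terminated.
        return True
    if minutes == 0:
        # Explicitly disables automatic termination.
        return True
    # Otherwise the threshold must lie between 10 and 10000 minutes
    # (bounds assumed inclusive).
    return 10 <= minutes <= 10000


print(is_valid_autotermination(30))  # value used in the example at the top of this page
print(is_valid_autotermination(5))   # below the documented minimum
```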

aws_attributes

Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used.

TYPE: ClusterAwsAttributes | None | VariableType DEFAULT: None

azure_attributes

Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used.

TYPE: ClusterAzureAttributes | None | VariableType DEFAULT: None

cluster_log_conf

The configuration for delivering Spark logs to a long-term storage destination. Three kinds of destinations (DBFS, S3 and Unity Catalog volumes) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 minutes. The destination of driver logs is $destination/$clusterId/driver, while the destination of executor logs is $destination/$clusterId/executor.

TYPE: ClusterClusterLogConf | None | VariableType DEFAULT: None
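
The path layout described above can be illustrated with a short sketch. The helper `log_destinations`, the destination `dbfs:/cluster-logs`, and the cluster ID are all made-up values for illustration; only the `$destination/$clusterId/driver` and `$destination/$clusterId/executor` pattern comes from the description.

```python
def log_destinations(destination: str, cluster_id: str) -> dict:
    """Hypothetical helper: build the documented driver and executor log paths."""
    # Drop a trailing slash so the joined path has single separators.
    base = f"{destination.rstrip('/')}/{cluster_id}"
    return {
        "driver": f"{base}/driver",
        "executor": f"{base}/executor",
    }


# Illustrative values only; neither is taken from a real workspace.
paths = log_destinations("dbfs:/cluster-logs", "0123-456789-abcdefgh")
for role, path in paths.items():
    print(role, path)
```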

cluster_mount_info

TYPE: list[ClusterClusterMountInfo] | None | VariableType DEFAULT: None

cluster_name

Cluster name, which doesn't have to be unique.

TYPE: str | None | VariableType DEFAULT: None

custom_tags

Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to default_tags.

TYPE: dict[str, str] | None | VariableType DEFAULT: None

data_security_mode

TYPE: str | None | VariableType DEFAULT: None

docker_image

Custom Docker image (bring your own container, BYOC).

TYPE: ClusterDockerImage | None | VariableType DEFAULT: None

driver_instance_pool_id

The optional ID of the instance pool to which the driver of the cluster belongs. If the driver pool is not assigned, the cluster uses the instance pool with ID instance_pool_id.

TYPE: str | None | VariableType DEFAULT: None

driver_node_type_flexibility

Flexible node type configuration for the driver node.

TYPE: ClusterDriverNodeTypeFlexibility | None | VariableType DEFAULT: None

driver_node_type_id

The node type of the Spark driver. This field is optional; if unset, the driver node type is set to the same value as node_type_id.

TYPE: str | None | VariableType DEFAULT: None

enable_elastic_disk

Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space.

TYPE: bool | None | VariableType DEFAULT: None

enable_local_disk_encryption

Whether to enable LUKS on the local disks of cluster VMs.

TYPE: bool | None | VariableType DEFAULT: None

gcp_attributes

Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used.

TYPE: ClusterGcpAttributes | None | VariableType DEFAULT: None

idempotency_token

TYPE: str | None | VariableType DEFAULT: None

init_scripts

The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If cluster_log_conf is specified, init script logs are sent to <destination>/<cluster-ID>/init_scripts.

TYPE: list[ClusterInitScripts] | None | VariableType DEFAULT: None
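
As a rough illustration of the ordering and log location described above, the sketch below models `init_scripts` as a plain list of dicts whose keys mirror the `ClusterInitScripts` fields (`workspace`, `volumes`, each with a `destination`). The script paths, the log destination, the cluster ID, and both helper functions are hypothetical.

```python
# Each entry names one destination type; paths are made up for illustration.
init_scripts = [
    {"workspace": {"destination": "/Shared/init/install_tools.sh"}},
    {"volumes": {"destination": "/Volumes/main/default/scripts/configure.sh"}},
]


def execution_order(scripts: list) -> list:
    """Hypothetical helper: list (kind, destination) pairs in the order provided."""
    order = []
    for script in scripts:
        for kind, spec in script.items():
            order.append((kind, spec["destination"]))
    return order


def init_script_log_path(log_destination: str, cluster_id: str) -> str:
    """Hypothetical helper: build <destination>/<cluster-ID>/init_scripts."""
    return f"{log_destination.rstrip('/')}/{cluster_id}/init_scripts"


print(execution_order(init_scripts))
print(init_script_log_path("dbfs:/cluster-logs", "0123-456789-abcdefgh"))
```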

instance_pool_id

The optional ID of the instance pool to which the cluster belongs.

TYPE: str | None | VariableType DEFAULT: None

is_pinned

TYPE: bool | None | VariableType DEFAULT: None

is_single_node

This field can only be used when kind = CLASSIC_PREVIEW.

TYPE: bool | None | VariableType DEFAULT: None

kind

TYPE: str | None | VariableType DEFAULT: None

library

TYPE: list[ClusterLibrary] | None | VariableType DEFAULT: None

no_wait

TYPE: bool | None | VariableType DEFAULT: None

node_type_id

This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the clusters/listNodeTypes API call.

TYPE: str | None | VariableType DEFAULT: None

num_workers

Number of worker nodes that this cluster should have. A cluster has one Spark Driver and num_workers Executors for a total of num_workers + 1 Spark nodes.

TYPE: int | None | VariableType DEFAULT: None
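
The node count arithmetic above, as a minimal sketch. The configuration is a plain dict for illustration; `spark_version` and `node_type_id` reuse the values from the example at the top of this page, and `num_workers: 2` is an arbitrary choice.

```python
# Fixed-size cluster settings expressed as a plain dict (illustrative only).
cluster_config = {
    "spark_version": "16.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
}

# One Spark driver plus num_workers executors.
total_nodes = cluster_config["num_workers"] + 1
print(total_nodes)
```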

policy_id

The ID of the cluster policy used to create the cluster if applicable.

TYPE: str | None | VariableType DEFAULT: None

remote_disk_throughput

If set, the configurable throughput (in Mb/s) of the remote disk. Currently only supported for GCP HYPERDISK_BALANCED disks.

TYPE: int | None | VariableType DEFAULT: None

runtime_engine

Determines the cluster's runtime engine, either standard or Photon.

TYPE: str | None | VariableType DEFAULT: None

single_user_name

Single user name if data_security_mode is SINGLE_USER

TYPE: str | None | VariableType DEFAULT: None

spark_conf

An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

TYPE: dict[str, str] | None | VariableType DEFAULT: None
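
A small sketch of what such a mapping might look like. The keys are standard Spark configuration properties, but the chosen values (partition count, JVM flag) are arbitrary illustrations, not recommendations. Since the field type is `dict[str, str]`, every value is written as a string.

```python
# Illustrative Spark configuration; all keys and values are strings.
spark_conf = {
    "spark.sql.shuffle.partitions": "64",
    # Extra JVM options for the driver and the executors, respectively.
    "spark.driver.extraJavaOptions": "-XX:+UseG1GC",
    "spark.executor.extraJavaOptions": "-XX:+UseG1GC",
}

# Sanity check that the mapping matches the dict[str, str] shape.
all_strings = all(
    isinstance(key, str) and isinstance(value, str)
    for key, value in spark_conf.items()
)
print(all_strings)
```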

spark_env_vars

An object containing a set of optional, user-specified environment variable key-value pairs. Note that a key-value pair of the form (X,Y) will be exported as is (i.e., export X='Y') while launching the driver and workers.

TYPE: dict[str, str] | None | VariableType DEFAULT: None
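
To make the export form concrete, the sketch below renders each pair the way the description states it, `export X='Y'`. The helper `render_exports` is hypothetical and only mimics the documented form; it is not how Databricks launches nodes, and it does not handle values that themselves contain single quotes. The variable names and values are made up.

```python
def render_exports(env_vars: dict) -> list:
    """Hypothetical helper: format each (X, Y) pair as export X='Y'."""
    return [f"export {name}='{value}'" for name, value in env_vars.items()]


# Illustrative environment variables.
spark_env_vars = {
    "ENVIRONMENT": "dev",
    "PIPELINE_OWNER": "data-team",
}

for line in render_exports(spark_env_vars):
    print(line)
```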

spark_version

The Spark version of the cluster, e.g. 3.3.x-scala2.11. A list of available Spark versions can be retrieved by using the clusters/sparkVersions API call.

TYPE: str | VariableType

ssh_public_keys

SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to log in with the user name ubuntu on port 2200. Up to 10 keys can be specified.

TYPE: list[str] | None | VariableType DEFAULT: None

timeouts

TYPE: ClusterTimeouts | None | VariableType DEFAULT: None

total_initial_remote_disk_size

If set, the total initial volume size (in GB) of the remote disks. Currently only supported for GCP HYPERDISK_BALANCED disks.

TYPE: int | None | VariableType DEFAULT: None

use_ml_runtime

This field can only be used when kind = CLASSIC_PREVIEW.

TYPE: bool | None | VariableType DEFAULT: None

worker_node_type_flexibility

Flexible node type configuration for worker nodes.

TYPE: ClusterWorkerNodeTypeFlexibility | None | VariableType DEFAULT: None

workload_type

TYPE: ClusterWorkloadType | None | VariableType DEFAULT: None

LAKTORY DESCRIPTION
access_controls

List of access controls

TYPE: list[AccessControl | VariableType] | VariableType DEFAULT: []

ATTRIBUTE DESCRIPTION
additional_core_resources
  • permissions

TYPE: list

additional_core_resources property ¤

  • permissions

laktory.models.resources.databricks.cluster.ClusterAutoscale ¤

Bases: BaseModel

PARAMETER DESCRIPTION
max_workers

TYPE: int | None | VariableType DEFAULT: None

min_workers

TYPE: int | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterAwsAttributes ¤

Bases: BaseModel

PARAMETER DESCRIPTION
availability

TYPE: str | None | VariableType DEFAULT: None

ebs_volume_count

TYPE: int | None | VariableType DEFAULT: None

ebs_volume_iops

TYPE: int | None | VariableType DEFAULT: None

ebs_volume_size

TYPE: int | None | VariableType DEFAULT: None

ebs_volume_throughput

TYPE: int | None | VariableType DEFAULT: None

ebs_volume_type

TYPE: str | None | VariableType DEFAULT: None

first_on_demand

TYPE: int | None | VariableType DEFAULT: None

instance_profile_arn

TYPE: str | None | VariableType DEFAULT: None

spot_bid_price_percent

TYPE: int | None | VariableType DEFAULT: None

zone_id

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterAzureAttributes ¤

Bases: BaseModel

PARAMETER DESCRIPTION
availability

TYPE: str | None | VariableType DEFAULT: None

first_on_demand

TYPE: int | None | VariableType DEFAULT: None

log_analytics_info

TYPE: ClusterAzureAttributesLogAnalyticsInfo | None | VariableType DEFAULT: None

spot_bid_max_price

TYPE: float | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterAzureAttributesLogAnalyticsInfo ¤

Bases: BaseModel

PARAMETER DESCRIPTION
log_analytics_primary_key

TYPE: str | None | VariableType DEFAULT: None

log_analytics_workspace_id

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterClusterLogConf ¤

Bases: BaseModel

PARAMETER DESCRIPTION
dbfs

TYPE: ClusterClusterLogConfDbfs | None | VariableType DEFAULT: None

s3

TYPE: ClusterClusterLogConfS3 | None | VariableType DEFAULT: None

volumes

TYPE: ClusterClusterLogConfVolumes | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterClusterLogConfDbfs ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterClusterLogConfS3 ¤

Bases: BaseModel

PARAMETER DESCRIPTION
canned_acl

TYPE: str | None | VariableType DEFAULT: None

destination

TYPE: str | VariableType

enable_encryption

TYPE: bool | None | VariableType DEFAULT: None

encryption_type

TYPE: str | None | VariableType DEFAULT: None

endpoint

TYPE: str | None | VariableType DEFAULT: None

kms_key

TYPE: str | None | VariableType DEFAULT: None

region

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterClusterLogConfVolumes ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterClusterMountInfo ¤

Bases: BaseModel

PARAMETER DESCRIPTION
local_mount_dir_path

TYPE: str | VariableType

network_filesystem_info

TYPE: ClusterClusterMountInfoNetworkFilesystemInfo | None | VariableType DEFAULT: None

remote_mount_dir_path

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterClusterMountInfoNetworkFilesystemInfo ¤

Bases: BaseModel

PARAMETER DESCRIPTION
mount_options

TYPE: str | None | VariableType DEFAULT: None

server_address

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterDockerImage ¤

Bases: BaseModel

PARAMETER DESCRIPTION
basic_auth

TYPE: ClusterDockerImageBasicAuth | None | VariableType DEFAULT: None

url

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterDockerImageBasicAuth ¤

Bases: BaseModel

PARAMETER DESCRIPTION
password

TYPE: str | VariableType

username

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterDriverNodeTypeFlexibility ¤

Bases: BaseModel

PARAMETER DESCRIPTION
alternate_node_type_ids

TYPE: list[str] | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterGcpAttributes ¤

Bases: BaseModel

PARAMETER DESCRIPTION
availability

TYPE: str | None | VariableType DEFAULT: None

boot_disk_size

TYPE: int | None | VariableType DEFAULT: None

first_on_demand

TYPE: int | None | VariableType DEFAULT: None

google_service_account

TYPE: str | None | VariableType DEFAULT: None

local_ssd_count

TYPE: int | None | VariableType DEFAULT: None

use_preemptible_executors

TYPE: bool | None | VariableType DEFAULT: None

zone_id

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterInitScripts ¤

Bases: BaseModel

PARAMETER DESCRIPTION
abfss

TYPE: ClusterInitScriptsAbfss | None | VariableType DEFAULT: None

dbfs

TYPE: ClusterInitScriptsDbfs | None | VariableType DEFAULT: None

file

TYPE: ClusterInitScriptsFile | None | VariableType DEFAULT: None

gcs

TYPE: ClusterInitScriptsGcs | None | VariableType DEFAULT: None

s3

TYPE: ClusterInitScriptsS3 | None | VariableType DEFAULT: None

volumes

TYPE: ClusterInitScriptsVolumes | None | VariableType DEFAULT: None

workspace

TYPE: ClusterInitScriptsWorkspace | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterInitScriptsAbfss ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterInitScriptsDbfs ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterInitScriptsFile ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterInitScriptsGcs ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterInitScriptsS3 ¤

Bases: BaseModel

PARAMETER DESCRIPTION
canned_acl

TYPE: str | None | VariableType DEFAULT: None

destination

TYPE: str | VariableType

enable_encryption

TYPE: bool | None | VariableType DEFAULT: None

encryption_type

TYPE: str | None | VariableType DEFAULT: None

endpoint

TYPE: str | None | VariableType DEFAULT: None

kms_key

TYPE: str | None | VariableType DEFAULT: None

region

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterInitScriptsVolumes ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterInitScriptsWorkspace ¤

Bases: BaseModel

PARAMETER DESCRIPTION
destination

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterLibrary ¤

Bases: BaseModel

PARAMETER DESCRIPTION
cran

TYPE: ClusterLibraryCran | None | VariableType DEFAULT: None

egg

TYPE: str | None | VariableType DEFAULT: None

jar

TYPE: str | None | VariableType DEFAULT: None

maven

TYPE: ClusterLibraryMaven | None | VariableType DEFAULT: None

pypi

TYPE: ClusterLibraryPypi | None | VariableType DEFAULT: None

requirements

TYPE: str | None | VariableType DEFAULT: None

whl

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterLibraryCran ¤

Bases: BaseModel

PARAMETER DESCRIPTION
package

TYPE: str | VariableType

repo

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterLibraryMaven ¤

Bases: BaseModel

PARAMETER DESCRIPTION
coordinates

TYPE: str | VariableType

exclusions

TYPE: list[str] | None | VariableType DEFAULT: None

repo

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterLibraryPypi ¤

Bases: BaseModel

PARAMETER DESCRIPTION
package

TYPE: str | VariableType

repo

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterLookup ¤

Bases: ResourceLookup

PARAMETER DESCRIPTION
cluster_id

The id of the cluster

TYPE: str | VariableType


laktory.models.resources.databricks.cluster.ClusterTimeouts ¤

Bases: BaseModel

PARAMETER DESCRIPTION
create

TYPE: str | None | VariableType DEFAULT: None

delete

TYPE: str | None | VariableType DEFAULT: None

update_

TYPE: str | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterWorkerNodeTypeFlexibility ¤

Bases: BaseModel

PARAMETER DESCRIPTION
alternate_node_type_ids

TYPE: list[str] | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterWorkloadType ¤

Bases: BaseModel

PARAMETER DESCRIPTION
clients

TYPE: ClusterWorkloadTypeClients | None | VariableType DEFAULT: None


laktory.models.resources.databricks.cluster.ClusterWorkloadTypeClients ¤

Bases: BaseModel

PARAMETER DESCRIPTION
jobs

TYPE: bool | None | VariableType DEFAULT: None

notebooks

TYPE: bool | None | VariableType DEFAULT: None