# Cluster

`laktory.models.resources.databricks.Cluster`

Bases: `ClusterBase`

Databricks cluster
Examples:

```python
import io

from laktory import models

cluster_yaml = """
name: default
spark_version: 16.3.x-scala2.12
data_security_mode: USER_ISOLATION
node_type_id: Standard_DS3_v2
autoscale:
  min_workers: 1
  max_workers: 4
autotermination_minutes: 30
libraries:
  - pypi:
      package: laktory==0.0.23
access_controls:
  - group_name: role-engineers
    permission_level: CAN_RESTART
is_pinned: true
"""

cluster = models.resources.databricks.Cluster.model_validate_yaml(
    io.StringIO(cluster_yaml)
)
```
## References
| BASE | DESCRIPTION |
|---|---|
| `apply_policy_default_values` | When set to true, fixed and default values from the policy will be used for fields that are omitted. When set to false, only fixed values from the policy will be applied. |
| `autoscale` | Parameters needed in order to automatically scale clusters up and down based on load. Note: autoscaling works best with DB runtime versions 3.0 or later. |
| `autotermination_minutes` | Automatically terminates the cluster after it is inactive for this time in minutes. If not set, this cluster will not be automatically terminated. If specified, the threshold must be between 10 and 10000 minutes. Users can also set this value to 0 to explicitly disable automatic termination. |
| `aws_attributes` | Attributes related to clusters running on Amazon Web Services. If not specified at cluster creation, a set of default values will be used. |
| `azure_attributes` | Attributes related to clusters running on Microsoft Azure. If not specified at cluster creation, a set of default values will be used. |
| `cluster_log_conf` | The configuration for delivering spark logs to a long-term storage destination. Three kinds of destinations (DBFS, S3 and Unity Catalog volumes) are supported. Only one destination can be specified for one cluster. If the conf is given, the logs will be delivered to the destination every 5 minutes. |
| `cluster_mount_info` | |
| `cluster_name` | Cluster name, which doesn't have to be unique. |
| `custom_tags` | Additional tags for cluster resources. Databricks will tag all cluster resources (e.g., AWS instances and EBS volumes) with these tags in addition to `default_tags`. |
| `data_security_mode` | |
| `docker_image` | Custom Docker image (BYOC) |
| `driver_instance_pool_id` | The optional ID of the instance pool to which the driver of the cluster belongs. The cluster uses the instance pool with id (`instance_pool_id`) if the driver pool is not assigned. |
| `driver_node_type_flexibility` | Flexible node type configuration for the driver node. |
| `driver_node_type_id` | The node type of the Spark driver. Note that this field is optional; if unset, the driver node type will be set to the same value as `node_type_id`. |
| `enable_elastic_disk` | Autoscaling Local Storage: when enabled, this cluster will dynamically acquire additional disk space when its Spark workers are running low on disk space. |
| `enable_local_disk_encryption` | Whether to enable LUKS on cluster VMs' local disks |
| `gcp_attributes` | Attributes related to clusters running on Google Cloud Platform. If not specified at cluster creation, a set of default values will be used. |
| `idempotency_token` | |
| `init_scripts` | The configuration for storing init scripts. Any number of destinations can be specified. The scripts are executed sequentially in the order provided. If `cluster_log_conf` is specified, init script logs are sent to `<destination>/<cluster-ID>/init_scripts`. |
| `instance_pool_id` | The optional ID of the instance pool to which the cluster belongs. |
| `is_pinned` | |
| `is_single_node` | This field can only be used when `kind = CLASSIC_PREVIEW`. When set to true, Databricks will automatically set single-node related `custom_tags`, `spark_conf`, and `num_workers`. |
| `kind` | |
| `library` | |
| `no_wait` | |
| `node_type_id` | This field encodes, through a single value, the resources available to each of the Spark nodes in this cluster. For example, the Spark nodes can be provisioned and optimized for memory or compute intensive workloads. A list of available node types can be retrieved by using the `clusters/listNodeTypes` API call. |
| `num_workers` | Number of worker nodes that this cluster should have. A cluster has one Spark Driver and `num_workers` Executors for a total of `num_workers` + 1 Spark nodes. |
| `policy_id` | The ID of the cluster policy used to create the cluster if applicable. |
| `remote_disk_throughput` | If set, the configurable throughput (in Mb/s) for the remote disk. Currently only supported for GCP `HYPERDISK_BALANCED` disks. |
| `runtime_engine` | Determines the cluster's runtime engine, either standard or Photon. |
| `single_user_name` | Single user name if `data_security_mode` is `SINGLE_USER`. |
| `spark_conf` | An object containing a set of optional, user-specified Spark configuration key-value pairs. Users can also pass in a string of extra JVM options to the driver and the executors via `spark.driver.extraJavaOptions` and `spark.executor.extraJavaOptions` respectively. |
| `spark_env_vars` | An object containing a set of optional, user-specified environment variable key-value pairs. Please note that a key-value pair of the form (X,Y) will be exported as is (i.e., `export X='Y'`) while launching the driver and workers. |
| `spark_version` | The Spark version of the cluster, e.g. `16.3.x-scala2.12`. |
| `ssh_public_keys` | SSH public key contents that will be added to each Spark node in this cluster. The corresponding private keys can be used to login with the user name `ubuntu` on port `2200`. |
| `timeouts` | |
| `total_initial_remote_disk_size` | If set, the total initial volume size (in GB) of the remote disks. Currently only supported for GCP `HYPERDISK_BALANCED` disks. |
| `use_ml_runtime` | This field can only be used when `kind = CLASSIC_PREVIEW`. The effective Spark version is determined by `spark_version` (DBR release), this field, and whether `node_type_id` is a GPU node or not. |
| `worker_node_type_flexibility` | Flexible node type configuration for worker nodes. |
| `workload_type` | |
| LAKTORY | DESCRIPTION |
|---|---|
| `access_controls` | List of access controls |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `additional_core_resources` | |

### `additional_core_resources` `property`

- permissions
## laktory.models.resources.databricks.cluster.ClusterAutoscale

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `max_workers` | |
| `min_workers` | |
## laktory.models.resources.databricks.cluster.ClusterAwsAttributes

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `availability` | |
| `ebs_volume_count` | |
| `ebs_volume_iops` | |
| `ebs_volume_size` | |
| `ebs_volume_throughput` | |
| `ebs_volume_type` | |
| `first_on_demand` | |
| `instance_profile_arn` | |
| `spot_bid_price_percent` | |
| `zone_id` | |
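As a sketch, the AWS attributes above might appear in the cluster YAML as follows. The values are illustrative (not defaults), and the enum strings shown (`SPOT_WITH_FALLBACK`, `GENERAL_PURPOSE_SSD`) are the conventional Databricks availability and EBS volume type values, assumed here rather than taken from this page:

```yaml
aws_attributes:
  availability: SPOT_WITH_FALLBACK   # fall back to on-demand if spot capacity is unavailable
  first_on_demand: 1                 # keep the driver on an on-demand instance
  spot_bid_price_percent: 100        # bid at 100% of the on-demand price
  ebs_volume_type: GENERAL_PURPOSE_SSD
  ebs_volume_count: 1
  ebs_volume_size: 100               # GB per volume
  zone_id: auto
```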
## laktory.models.resources.databricks.cluster.ClusterAzureAttributes

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `availability` | |
| `first_on_demand` | |
| `log_analytics_info` | |
| `spot_bid_max_price` | |
## laktory.models.resources.databricks.cluster.ClusterAzureAttributesLogAnalyticsInfo

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `log_analytics_primary_key` | |
| `log_analytics_workspace_id` | |
## laktory.models.resources.databricks.cluster.ClusterClusterLogConf

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `dbfs` | |
| `s3` | |
| `volumes` | |
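A minimal sketch of delivering cluster logs to a Unity Catalog volume (only one of `dbfs`, `s3`, or `volumes` may be set). The `destination` field and the volume path are assumptions for illustration; the sub-model's fields are not listed on this page:

```yaml
cluster_log_conf:
  volumes:
    destination: /Volumes/main/ops/cluster-logs  # hypothetical UC volume path
```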
## laktory.models.resources.databricks.cluster.ClusterClusterLogConfDbfs
## laktory.models.resources.databricks.cluster.ClusterClusterLogConfS3

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `canned_acl` | |
| `destination` | |
| `enable_encryption` | |
| `encryption_type` | |
| `endpoint` | |
| `kms_key` | |
| `region` | |
## laktory.models.resources.databricks.cluster.ClusterClusterLogConfVolumes
## laktory.models.resources.databricks.cluster.ClusterClusterMountInfo

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `local_mount_dir_path` | |
| `network_filesystem_info` | |
| `remote_mount_dir_path` | |
## laktory.models.resources.databricks.cluster.ClusterClusterMountInfoNetworkFilesystemInfo

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `mount_options` | |
| `server_address` | |
## laktory.models.resources.databricks.cluster.ClusterDockerImage

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `basic_auth` | |
| `url` | |
## laktory.models.resources.databricks.cluster.ClusterDockerImageBasicAuth

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `password` | |
| `username` | |
## laktory.models.resources.databricks.cluster.ClusterDriverNodeTypeFlexibility

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `alternate_node_type_ids` | |
## laktory.models.resources.databricks.cluster.ClusterGcpAttributes

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `availability` | |
| `boot_disk_size` | |
| `first_on_demand` | |
| `google_service_account` | |
| `local_ssd_count` | |
| `use_preemptible_executors` | |
| `zone_id` | |
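A hedged sketch of the GCP attributes in cluster YAML. All values are placeholders, and the `PREEMPTIBLE_WITH_FALLBACK_GCP` availability string is the conventional Databricks-on-GCP value, assumed here rather than documented on this page:

```yaml
gcp_attributes:
  availability: PREEMPTIBLE_WITH_FALLBACK_GCP  # preemptible VMs, fall back to on-demand
  first_on_demand: 1                           # keep the driver on-demand
  zone_id: us-central1-a
  google_service_account: cluster-sa@my-project.iam.gserviceaccount.com  # hypothetical SA
  boot_disk_size: 100                          # GB
```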
## laktory.models.resources.databricks.cluster.ClusterInitScripts

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `abfss` | |
| `dbfs` | |
| `file` | |
| `gcs` | |
| `s3` | |
| `volumes` | |
| `workspace` | |
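Since the cluster-level `init_scripts` field accepts a list and scripts run in the order given, a sketch might look like the following. The `destination` field on the `volumes` and `workspace` blocks is an assumption (their sub-models' fields are not listed on this page), and the script paths are hypothetical:

```yaml
init_scripts:
  - volumes:
      destination: /Volumes/main/ops/scripts/install_deps.sh  # runs first
  - workspace:
      destination: /Shared/init/setup_env.sh                  # runs second
```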
## laktory.models.resources.databricks.cluster.ClusterInitScriptsAbfss

## laktory.models.resources.databricks.cluster.ClusterInitScriptsDbfs

## laktory.models.resources.databricks.cluster.ClusterInitScriptsFile

## laktory.models.resources.databricks.cluster.ClusterInitScriptsGcs
## laktory.models.resources.databricks.cluster.ClusterInitScriptsS3

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `canned_acl` | |
| `destination` | |
| `enable_encryption` | |
| `encryption_type` | |
| `endpoint` | |
| `kms_key` | |
| `region` | |
## laktory.models.resources.databricks.cluster.ClusterInitScriptsVolumes

## laktory.models.resources.databricks.cluster.ClusterInitScriptsWorkspace
## laktory.models.resources.databricks.cluster.ClusterLibrary

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `cran` | |
| `egg` | |
| `jar` | |
| `maven` | |
| `pypi` | |
| `requirements` | |
| `whl` | |
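Combining the library kinds above, a `libraries` list in the cluster YAML might look like this (the `pypi` entry mirrors the example at the top of the page; the Maven coordinates and wheel path are hypothetical):

```yaml
libraries:
  - pypi:
      package: laktory==0.0.23
  - maven:
      coordinates: com.example:my-library_2.12:1.0.0   # hypothetical artifact
  - whl: /Volumes/main/ops/wheels/my_package-0.1.0-py3-none-any.whl  # hypothetical path
```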
## laktory.models.resources.databricks.cluster.ClusterLibraryCran

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `package` | |
| `repo` | |
## laktory.models.resources.databricks.cluster.ClusterLibraryMaven

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `coordinates` | |
| `exclusions` | |
| `repo` | |
## laktory.models.resources.databricks.cluster.ClusterLibraryPypi

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `package` | |
| `repo` | |
## laktory.models.resources.databricks.cluster.ClusterLookup

Bases: `ResourceLookup`

| PARAMETER | DESCRIPTION |
|---|---|
| `cluster_id` | The id of the cluster |
## laktory.models.resources.databricks.cluster.ClusterTimeouts

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `create` | |
| `delete` | |
| `update_` | |
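A sketch of the timeouts block in YAML. The duration-string format (`30m`) follows the Terraform convention and is an assumption here; note the trailing underscore on `update_`, which is the field name as defined on the model above:

```yaml
timeouts:
  create: 30m
  delete: 15m
  update_: 20m   # trailing underscore per the model's field name
```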
## laktory.models.resources.databricks.cluster.ClusterWorkerNodeTypeFlexibility

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `alternate_node_type_ids` | |
## laktory.models.resources.databricks.cluster.ClusterWorkloadType

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `clients` | |
## laktory.models.resources.databricks.cluster.ClusterWorkloadTypeClients

Bases: `BaseModel`

| PARAMETER | DESCRIPTION |
|---|---|
| `jobs` | |
| `notebooks` | |
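Putting the two models together, a sketch of restricting a cluster to job workloads only might look like this in the cluster YAML (the boolean values are illustrative):

```yaml
workload_type:
  clients:
    jobs: true        # allow scheduled jobs to run on this cluster
    notebooks: false  # disallow interactive notebook attachment
```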