Databricks Asset Bundles resources
Databricks Asset Bundles allows you to specify information about the Azure Databricks resources used by the bundle in the resources mapping in the bundle configuration. See resources mapping and resources key reference.
This article outlines supported resource types for bundles and provides details and an example for each supported type. For additional examples, see Bundle configuration examples.
Tip
To generate YAML for any existing resource, use the databricks bundle generate command. See Generate a bundle configuration file.
Supported resources
The following table lists supported resource types for bundles. Some resources can be created by defining them in a bundle and deploying the bundle, and some resources only support referencing an existing resource to include in the bundle.
Resources are defined using the corresponding Databricks REST API object’s create operation request payload, where the object’s supported fields, expressed as YAML, are the resource’s supported properties. Links to documentation for each resource’s corresponding payloads are listed in the table.
Tip
The databricks bundle validate command returns warnings if unknown resource properties are found in bundle configuration files.
Resource | Create support | Corresponding REST API object |
---|---|---|
app | ✓ | App object |
cluster | ✓ | Cluster object |
dashboard | | Dashboard object |
experiment | ✓ | Experiment object |
job | ✓ | Job object |
model (legacy) | ✓ | Model (legacy) object |
model_serving_endpoint | ✓ | Model serving endpoint object |
pipeline | ✓ | [Pipeline object](https://docs.databricks.com/api/azure/workspace/pipelines/create) |
quality_monitor | ✓ | Quality monitor object |
registered_model (Unity Catalog) | ✓ | Registered model object |
schema (Unity Catalog) | ✓ | Schema object |
volume (Unity Catalog) | ✓ | Volume object |
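As a rough illustration of how a create request payload maps to resource properties, the sketch below expresses a few fields of the Jobs API create request (name, max_concurrent_runs, timeout_seconds, and tags) as YAML under a job resource. The resource key my_job and all values are hypothetical and chosen only for illustration.

```yaml
resources:
  jobs:
    # "my_job" is a hypothetical resource key, used only within the bundle
    my_job:
      # Each property below corresponds to a field of the Jobs API create request payload
      name: my-job
      max_concurrent_runs: 1
      timeout_seconds: 3600
      tags:
        team: data-engineering
```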
app
The app resource defines a Databricks app. For information about Databricks Apps, see What is Databricks Apps?.
Tip
You can initialize a bundle with a Streamlit Databricks app using the following command:
databricks bundle init https://github.com/databricks/bundle-examples --template-dir contrib/templates/streamlit-app
To add an app, specify the object fields that define the app, as well as the following:
- source_code_path - The local path of the Databricks app source code, such as ./app. This field is required.
- config - The app configuration commands and environment variables. You can use this to specify different app deployment targets.
Example
The following example creates an app named job_manager_app, with the resource key job_manager, that manages a job created by the bundle:
resources:
  jobs:
    # Define a job in the bundle
    hello_world:
      name: hello_world
      tasks:
        - task_key: task
          spark_python_task:
            python_file: ../src/main.py
          environment_key: default
      environments:
        - environment_key: default
          spec:
            client: "1"
  # Define an app that manages the job in the bundle
  apps:
    job_manager:
      name: "job_manager_app"
      description: "An app which manages a job created by this bundle"
      # The location of the source code for the app
      source_code_path: ../src/app
      # The configuration for running the app
      config:
        command:
          - flask
          - --app
          - app
          - run
          - --debug
        env:
          - name: JOB_ID
            value: ${resources.jobs.hello_world.id}
      # The resources in the bundle which this app has access to. This binds the resource in the app with the DABs resource.
      resources:
        - name: "app-job"
          job:
            id: ${resources.jobs.hello_world.id}
            permission: "CAN_MANAGE_RUN"
For the complete Databricks app example bundle, see the bundle-examples GitHub repository.
cluster
The cluster resource defines an all-purpose cluster.
Example
The following example creates a cluster named my_cluster and sets that as the cluster to use to run the notebook in my_job:
bundle:
  name: clusters
resources:
  clusters:
    my_cluster:
      num_workers: 2
      node_type_id: "i3.xlarge"
      autoscale:
        min_workers: 2
        max_workers: 7
      spark_version: "13.3.x-scala2.12"
      spark_conf:
        "spark.executor.memory": "2g"
  jobs:
    my_job:
      tasks:
        - task_key: test_task
          # Run the task on the cluster defined above
          existing_cluster_id: ${resources.clusters.my_cluster.id}
          notebook_task:
            notebook_path: "./src/my_notebook.py"
dashboard
The dashboard resource allows you to manage AI/BI dashboards in a bundle. For information about AI/BI dashboards, see Dashboards.
Example
The following example includes and deploys the sample NYC Taxi Trip Analysis dashboard to the Databricks workspace.
resources:
  dashboards:
    nyc_taxi_trip_analysis:
      display_name: "NYC Taxi Trip Analysis"
      file_path: ../src/nyc_taxi_trip_analysis.lvdash.json
      warehouse_id: ${var.warehouse_id}
If you use the UI to modify the dashboard, modifications made through the UI are not applied to the dashboard JSON file in the local bundle unless you explicitly update it using bundle generate. You can use the --watch option to continuously poll and retrieve changes to the dashboard. See Generate a bundle configuration file.
In addition, if you attempt to deploy a bundle that contains a dashboard JSON file that differs from the one in the remote workspace, an error occurs. To force the deployment and overwrite the dashboard in the remote workspace with the local one, use the --force option. See Deploy a bundle.
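The dashboard example above references ${var.warehouse_id}, so the bundle needs to declare that variable somewhere in its configuration. A minimal sketch, assuming a top-level variables mapping in the same bundle; the default shown is a placeholder, not a real warehouse ID:

```yaml
variables:
  warehouse_id:
    description: ID of the SQL warehouse used to run the dashboard queries
    # Placeholder (assumption): replace with the ID of a warehouse in your workspace
    default: "<your-warehouse-id>"
```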
experiment
The experiment resource allows you to define MLflow experiments in a bundle. For information about MLflow experiments, see Organize training runs with MLflow experiments.
Example
The following example defines an experiment that all users can view:
resources:
  experiments:
    experiment:
      name: my_ml_experiment
      permissions:
        - level: CAN_READ
          group_name: users
      description: MLflow experiment used to track runs
job
The job resource allows you to define jobs and their corresponding tasks in your bundle. For information about jobs, see Schedule and orchestrate workflows. For a tutorial that uses a Databricks Asset Bundles template to create a job, see Develop a job on Azure Databricks using Databricks Asset Bundles.
Example
The following example defines a job with the resource key hello-job with one notebook task:
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          notebook_task:
            notebook_path: ./hello.py
For information about defining job tasks and overriding job settings, see Add tasks to jobs in Databricks Asset Bundles, Override job tasks settings in Databricks Asset Bundles, and Override cluster settings in Databricks Asset Bundles.
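As a sketch of what overriding job settings can look like, the following assumes a deployment target named dev (a hypothetical name) and overrides one setting of the hello-task task from the example above. Task settings in a target are matched to the top-level task definition by task_key; the timeout value is illustrative only.

```yaml
targets:
  # "dev" is a hypothetical target name
  dev:
    resources:
      jobs:
        hello-job:
          tasks:
            # Matched to the top-level task definition by task_key
            - task_key: hello-task
              # Illustrative override, applied only when deploying to this target
              timeout_seconds: 600
```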
model (legacy)
The model resource allows you to define legacy models in bundles. Databricks recommends you use Unity Catalog registered models instead.
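A minimal sketch of a legacy model definition, assuming the models mapping under resources and the name and description fields of the legacy model create payload. The resource key and values are hypothetical:

```yaml
resources:
  models:
    # "my_legacy_model" is a hypothetical resource key
    my_legacy_model:
      name: my-legacy-model
      description: Legacy MLflow registered model managed by this bundle
```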
model_serving_endpoint
The model_serving_endpoint resource allows you to define model serving endpoints. See Manage model serving endpoints.
Example
The following example defines a Unity Catalog model serving endpoint:
resources:
  model_serving_endpoints:
    uc_model_serving_endpoint:
      name: "uc-model-endpoint"
      config:
        served_entities:
          - entity_name: "myCatalog.mySchema.my-ads-model"
            entity_version: "10"
            workload_size: "Small"
            scale_to_zero_enabled: true
        traffic_config:
          routes:
            - served_model_name: "my-ads-model-10"
              traffic_percentage: 100
      tags:
        - key: "team"
          value: "data science"
quality_monitor (Unity Catalog)
The quality_monitor resource allows you to define a Unity Catalog table monitor. For information about monitors, see Monitor model quality and endpoint health.
Example
The following example defines a quality monitor:
resources:
  quality_monitors:
    my_quality_monitor:
      table_name: dev.mlops_schema.predictions
      output_schema_name: ${bundle.target}.mlops_schema
      assets_dir: /Users/${workspace.current_user.userName}/databricks_lakehouse_monitoring
      inference_log:
        granularities: [1 day]
        model_id_col: model_id
        prediction_col: prediction
        label_col: price
        problem_type: PROBLEM_TYPE_REGRESSION
        timestamp_col: timestamp
      schedule:
        quartz_cron_expression: 0 0 8 * * ? # Run every day at 8am
        timezone_id: UTC
registered_model (Unity Catalog)
The registered model resource allows you to define models in Unity Catalog. For information about Unity Catalog registered models, see Manage model lifecycle in Unity Catalog.
Example
The following example defines a registered model in Unity Catalog:
resources:
  registered_models:
    model:
      name: my_model
      catalog_name: ${bundle.target}
      schema_name: mlops_schema
      comment: Registered model in Unity Catalog for ${bundle.target} deployment target
      grants:
        - privileges:
            - EXECUTE
          principal: account users
pipeline
The pipeline resource allows you to create Delta Live Tables pipelines. For information about pipelines, see What is Delta Live Tables?. For a tutorial that uses the Databricks Asset Bundles template to create a pipeline, see Develop Delta Live Tables pipelines with Databricks Asset Bundles.
Example
The following example defines a pipeline with the resource key hello-pipeline:
resources:
  pipelines:
    hello-pipeline:
      name: hello-pipeline
      clusters:
        - label: default
          num_workers: 1
      development: true
      continuous: false
      channel: CURRENT
      edition: CORE
      photon: false
      libraries:
        - notebook:
            path: ./pipeline.py
schema (Unity Catalog)
The schema resource type allows you to define Unity Catalog schemas for tables and other assets in your workflows and pipelines created as part of a bundle. Unlike other resource types, a schema has the following limitations:
- The owner of a schema resource is always the deployment user, and cannot be changed. If run_as is specified in the bundle, it is ignored by operations on the schema.
- Only fields supported by the corresponding Schemas object create API are available for the schema resource. For example, enable_predictive_optimization is not supported because it is only available on the update API.
Examples
The following example defines a pipeline with the resource key my_pipeline that creates a Unity Catalog schema with the key my_schema as the target:
resources:
  pipelines:
    my_pipeline:
      name: test-pipeline-{{.unique_id}}
      libraries:
        - notebook:
            path: ./nb.sql
      development: true
      catalog: main
      target: ${resources.schemas.my_schema.id}
  schemas:
    my_schema:
      name: test-schema-{{.unique_id}}
      catalog_name: main
      comment: This schema was created by DABs.
A top-level grants mapping is not supported by Databricks Asset Bundles, so if you want to set grants for a schema, define the grants for the schema within the schemas mapping. For more information about grants, see Show, grant, and revoke privileges.
The following example defines a Unity Catalog schema with grants:
resources:
  schemas:
    my_schema:
      name: test-schema
      grants:
        - principal: users
          privileges:
            - CAN_MANAGE
        - principal: my_team
          privileges:
            - CAN_READ
      catalog_name: main
volume (Unity Catalog)
The volume resource type allows you to define and create Unity Catalog volumes as part of a bundle. When deploying a bundle with a volume defined, note that:
- A volume cannot be referenced in the artifact_path for the bundle until it exists in the workspace. Hence, if you want to use Databricks Asset Bundles to create the volume, you must first define the volume in the bundle, deploy it to create the volume, then reference it in the artifact_path in subsequent deployments.
- Volumes in the bundle are not prepended with the dev_${workspace.current_user.short_name} prefix when the deployment target has mode: development configured. However, you can manually configure this prefix. See Custom presets.
Example
The following example creates a Unity Catalog volume with the key my_volume:
resources:
  volumes:
    my_volume:
      catalog_name: main
      name: my_volume
      schema_name: my_schema
For an example bundle that runs a job that writes to a file in a Unity Catalog volume, see the bundle-examples GitHub repository.
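The second step of the two-step deployment described above might look like the following sketch. It assumes the volume from the example was already created by an earlier deployment and that artifact_path is set under the workspace mapping; the artifacts subdirectory is a hypothetical choice.

```yaml
workspace:
  # Assumes /Volumes/main/my_schema/my_volume already exists from a previous deployment
  artifact_path: /Volumes/main/my_schema/my_volume/artifacts
```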