Databricks - Download Files from DBFS to Shipyard

Overview

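Quickly export one or more files from your Databricks File System (DBFS) to Shipyard. The selected match type determines whether this Blueprint downloads a single file with an exact name or every file whose name matches a regular expression.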

Variables

| Name | Reference | Type | Required | Default | Options | Description |
|---|---|---|---|---|---|---|
| Databricks Folder Name | DATABRICKS_SOURCE_FOLDER_NAME | Alphanumeric | ➖ | - | - | Name of the folder where the file is stored in the Databricks File System (DBFS). If left blank, the Blueprint looks in /FileStore/. |
| Databricks File Name Match Type | DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE | Select | ✅ | exact_match | Exact Match: exact_match; Regex Match: regex_match | Determines whether the text in "Databricks File Name" is used to find one file by exact match or multiple files by regex. |
| Databricks File Name | DATABRICKS_SOURCE_FILE_NAME | Alphanumeric | ✅ | - | - | Name of the target file in the Databricks File System (DBFS). Can be a regex if "Match Type" is set accordingly. |
| Shipyard Folder Name | DATABRICKS_DESTINATION_FOLDER_NAME | Alphanumeric | ➖ | - | - | Folder where the file(s) should be downloaded on Shipyard. If left blank, the file(s) are placed in the home directory. |
| Shipyard File Name | DATABRICKS_DESTINATION_FILE_NAME | Alphanumeric | ➖ | - | - | What to name the file(s) being downloaded on Shipyard. If left blank, defaults to the original file name(s). |
| Workspace Instance URL | DATABRICKS_INSTANCE_URL | Alphanumeric | ✅ | - | - | The subdomain, domain, and top-level domain (TLD) of your Databricks Workspace URL. |
| Access Token | DATABRICKS_ACCESS_TOKEN | Password | ✅ | - | - | The personal access token associated with the provided Workspace Instance. |
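
To illustrate how the two match types differ, here is a minimal Python sketch. It is not the Blueprint's actual implementation: the helper `select_files` and the sample file names are hypothetical, and it assumes a regex may match anywhere in the file name (`re.search`), which the Blueprint may or may not do.

```python
import re


def select_files(file_names, pattern, match_type="exact_match"):
    """Return the names a given match type would select (hypothetical helper)."""
    if match_type == "exact_match":
        # Exact match: at most one file whose name equals the text exactly.
        return [name for name in file_names if name == pattern][:1]
    if match_type == "regex_match":
        # Assumption: the pattern may match anywhere in the name.
        compiled = re.compile(pattern)
        return [name for name in file_names if compiled.search(name)]
    raise ValueError(f"unknown match type: {match_type!r}")


# Hypothetical folder listing, for illustration only.
files = ["sales_2023.csv", "sales_2024.csv", "notes.txt"]

exact = select_files(files, "sales_2024.csv")
regex = select_files(files, r"^sales_\d{4}\.csv$", "regex_match")
```

With an exact match, a partial name such as `sales` selects nothing, whereas the regex above selects both yearly CSV files. Anchoring the pattern with `^` and `$` is a cautious choice when it is unclear whether partial matches count.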

YAML

Below is the YAML template for this Blueprint, which can be used in the Fleet YAML Editor.

source:
  blueprint: Databricks - Download Files from DBFS to Shipyard
  inputs:
    DATABRICKS_SOURCE_FOLDER_NAME: null
    DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE: exact_match ## REQUIRED
    DATABRICKS_SOURCE_FILE_NAME: null ## REQUIRED
    DATABRICKS_DESTINATION_FOLDER_NAME: null
    DATABRICKS_DESTINATION_FILE_NAME: null
    DATABRICKS_INSTANCE_URL: null ## REQUIRED
    DATABRICKS_ACCESS_TOKEN: null ## REQUIRED
  type: BLUEPRINT
guardrails:
  retry_count: 1
  retry_wait: 0h0m0s
  runtime_cutoff: 1h0m0s
  exclude_exit_code_ranges:
    - '200'
    - '201'
    - '202'
    - '203'
    - '212'
    - '214'
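
Before pasting filled-in values into the template, it can help to check them. The Python sketch below is one possible pre-check, not part of Shipyard: `check_inputs` and `instance_host` are hypothetical helpers, the example URL and token are placeholders, and reducing a full workspace URL to its host is an assumption based on the "subdomain, domain, and TLD" wording in the variable description.

```python
from urllib.parse import urlparse

# Inputs marked "## REQUIRED" in the template.
REQUIRED = (
    "DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE",
    "DATABRICKS_SOURCE_FILE_NAME",
    "DATABRICKS_INSTANCE_URL",
    "DATABRICKS_ACCESS_TOKEN",
)
MATCH_TYPES = ("exact_match", "regex_match")


def instance_host(url):
    """Reduce a workspace URL to subdomain.domain.tld (hypothetical helper)."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    return parsed.hostname or ""


def check_inputs(inputs):
    """Return a list of problems found in a Blueprint inputs mapping."""
    problems = [f"{key} is required" for key in REQUIRED if not inputs.get(key)]
    match_type = inputs.get("DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE")
    if match_type and match_type not in MATCH_TYPES:
        problems.append(f"unsupported match type: {match_type}")
    return problems


# Placeholder values, for illustration only.
inputs = {
    "DATABRICKS_SOURCE_FOLDER_NAME": None,
    "DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE": "exact_match",
    "DATABRICKS_SOURCE_FILE_NAME": "report.csv",
    "DATABRICKS_INSTANCE_URL": instance_host("https://example.cloud.databricks.com/?o=1"),
    "DATABRICKS_ACCESS_TOKEN": "example-token",
}
problems = check_inputs(inputs)
```

An empty `problems` list means every required input is present and the match type is one of the two listed options; it does not confirm that the token or workspace is valid.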