Databricks - Download Files from DBFS to Platform
Overview
Quickly export one or more files from your Databricks File System (DBFS). The match type you select determines whether the Blueprint downloads a single file by exact name or every file whose name matches a regular expression.
Variables
| Name | Reference | Type | Required | Default | Options | Description |
|---|---|---|---|---|---|---|
| Databricks Folder Name | DATABRICKS_SOURCE_FOLDER_NAME | Alphanumeric | No | - | - | Name of the folder where the file is stored in the Databricks File System (DBFS). If left blank, the Blueprint looks in /FileStore/. |
| Databricks File Name Match Type | DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE | Select | Yes | exact_match | Exact Match: exact_match; Regex Match: regex_match | Determines whether the text in "Databricks File Name" is matched exactly to find one file or treated as a regex to find multiple files. |
| Databricks File Name | DATABRICKS_SOURCE_FILE_NAME | Alphanumeric | Yes | - | - | Name of the target file in the Databricks File System (DBFS). Can be a regular expression if the match type is set to regex_match. |
| Shipyard Folder Name | DATABRICKS_DESTINATION_FOLDER_NAME | Alphanumeric | No | - | - | Folder where the file(s) should be downloaded on Platform. If left blank, the file is placed in the home directory. |
| Shipyard File Name | DATABRICKS_DESTINATION_FILE_NAME | Alphanumeric | No | - | - | What to name the file(s) being downloaded on Platform. If left blank, defaults to the original file name(s). |
| Workspace Instance URL | DATABRICKS_INSTANCE_URL | Alphanumeric | Yes | - | - | The subdomain, domain, and top-level domain (TLD) of your Databricks Workspace URL. |
| Access Token | DATABRICKS_ACCESS_TOKEN | Password | Yes | - | - | The personal access token associated with the provided Workspace Instance. |
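
To make the two match types concrete, here is a minimal Python sketch of how file selection could behave. It is not the Blueprint's actual implementation: the function names `resolve_source_folder` and `select_files` are hypothetical, the sample listing is made up, and the choice to search the pattern unanchored within the file's base name is an assumption.

```python
import posixpath
import re

DEFAULT_DBFS_FOLDER = "/FileStore/"  # default folder named in the table above


def resolve_source_folder(folder_name):
    """Return the DBFS folder to search, falling back to /FileStore/ when blank."""
    if not folder_name or not folder_name.strip():
        return DEFAULT_DBFS_FOLDER
    return folder_name


def select_files(available, file_name, match_type="exact_match"):
    """Pick the DBFS paths to download for the given match type (hypothetical helper)."""
    if match_type == "exact_match":
        # Exact match: at most one file whose base name equals file_name.
        return [p for p in available if posixpath.basename(p) == file_name][:1]
    if match_type == "regex_match":
        # Assumption: the pattern is searched (unanchored) within the base name.
        pattern = re.compile(file_name)
        return [p for p in available if pattern.search(posixpath.basename(p))]
    raise ValueError(f"Unknown match type: {match_type!r}")


# Made-up listing used only for illustration.
listing = [
    "/FileStore/sales_2023.csv",
    "/FileStore/sales_2024.csv",
    "/FileStore/notes.txt",
]

print(select_files(listing, "sales_2024.csv"))
# ['/FileStore/sales_2024.csv']
print(select_files(listing, r"^sales_\d{4}\.csv$", "regex_match"))
# ['/FileStore/sales_2023.csv', '/FileStore/sales_2024.csv']
```

With exact_match the file name is compared literally, so characters such as `.` carry no special meaning. With regex_match the same text is compiled as a pattern, so anchor it with `^` and `$` if partial matches are not wanted.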
YAML
Below is the YAML template for this Blueprint, which can be used in the Fleet YAML Editor.
source:
  blueprint: Databricks - Download Files from DBFS to Shipyard
  inputs:
    DATABRICKS_SOURCE_FOLDER_NAME: null
    DATABRICKS_SOURCE_FILE_NAME_MATCH_TYPE: exact_match ## REQUIRED
    DATABRICKS_SOURCE_FILE_NAME: null ## REQUIRED
    DATABRICKS_DESTINATION_FOLDER_NAME: null
    DATABRICKS_DESTINATION_FILE_NAME: null
    DATABRICKS_INSTANCE_URL: null ## REQUIRED
    DATABRICKS_ACCESS_TOKEN: null ## REQUIRED
  type: BLUEPRINT
guardrails:
  retry_count: 1
  retry_wait: 0h0m0s
  runtime_cutoff: 1h0m0s
  exclude_exit_code_ranges:
    - '200'
    - '201'
    - '202'
    - '203'
    - '212'
    - '214'
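
The guardrail values in the template can be read programmatically. The Python sketch below shows one way to interpret them; it is an illustration under stated assumptions rather than how the platform itself parses the file. The helper names `parse_duration` and `is_excluded` are hypothetical, durations are assumed to always follow the `NhNmNs` shape seen in the template, and support for `low-high` entries in `exclude_exit_code_ranges` is an assumption based only on the field's name.

```python
import re
from datetime import timedelta

# Assumption: durations always look like '1h0m0s', as in the template above.
_DURATION = re.compile(r"^(\d+)h(\d+)m(\d+)s$")


def parse_duration(text):
    """Convert a string such as '1h0m0s' into a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"Unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def is_excluded(exit_code, entries):
    """Return True if exit_code is covered by an entry such as '200' or '200-205'."""
    for entry in entries:
        # Assumption: an entry is either a single code or a 'low-high' range.
        low, _, high = str(entry).partition("-")
        if int(low) <= exit_code <= int(high or low):
            return True
    return False


excluded = ["200", "201", "202", "203", "212", "214"]  # values from the template

print(parse_duration("1h0m0s"))    # 1:00:00
print(is_excluded(203, excluded))  # True
print(is_excluded(204, excluded))  # False
```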