
File Manipulation - Convert CSV

Overview

Convert one or more CSV files into a TSV, PSV, XLSX, Parquet, DTA, or HDF5 file.

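The match type you select determines which files are converted: an exact match targets a single file by name, while a regex match targets every file whose name matches the pattern.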
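
The difference between the two match types can be sketched as follows. This is an illustrative sketch, not the Blueprint's actual code: the helper name `find_source_files` is hypothetical, and the choices to search only the top level of the folder and to use `re.search` semantics are assumptions.

```python
import os
import re
from typing import List


def find_source_files(folder: str, file_name: str, match_type: str) -> List[str]:
    """Return the file paths selected by the given match type (hypothetical helper)."""
    if match_type == "exact_match":
        # Exact match: the name is used verbatim and refers to a single file.
        return [os.path.join(folder, file_name)]
    if match_type == "regex_match":
        # Regex match: every file name in the folder that matches the pattern.
        # Assumption: only the top level of the folder is searched.
        pattern = re.compile(file_name)
        return sorted(
            os.path.join(folder, name)
            for name in os.listdir(folder)
            if pattern.search(name)
        )
    raise ValueError(f"Unknown match type: {match_type}")
```

With a regex match, a pattern such as `sales_\d+\.csv` would select `sales_1.csv` and `sales_2.csv` but not `notes.txt`.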

This Blueprint uses built-in Pandas methods to read in a CSV file and output it as another file type.
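
As a rough illustration of what "built-in Pandas methods" means here, the sketch below reads a CSV and writes it in one of the supported formats. It is a minimal sketch under stated assumptions, not the Blueprint's source: the function name `convert_csv`, the HDF5 key `"data"`, and the decision to omit the index are all assumptions made for the example.

```python
import pandas as pd


def convert_csv(source_path: str, destination_path: str, file_format: str) -> None:
    """Read a CSV with pandas and write it in the requested format (illustrative)."""
    df = pd.read_csv(source_path)
    if file_format == "tsv":
        df.to_csv(destination_path, sep="\t", index=False)
    elif file_format == "psv":
        df.to_csv(destination_path, sep="|", index=False)
    elif file_format == "xlsx":
        # Requires an Excel engine such as openpyxl to be installed.
        df.to_excel(destination_path, index=False)
    elif file_format == "parquet":
        # Requires pyarrow or fastparquet to be installed.
        df.to_parquet(destination_path, index=False)
    elif file_format == "stata":
        df.to_stata(destination_path, write_index=False)
    elif file_format == "hdf5":
        # Requires PyTables; the key "data" is an arbitrary choice for this sketch.
        df.to_hdf(destination_path, key="data", mode="w")
    else:
        raise ValueError(f"Unsupported format: {file_format}")
```

For example, `convert_csv("report.csv", "report.tsv", "tsv")` would write a tab-separated copy of `report.csv`.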


For more information on how to use this Blueprint, read the documentation. You can also dig into the open-source code on GitHub.

Variables

| Name | Reference | Type | Required | Default | Options | Description |
|---|---|---|---|---|---|---|
| Destination File Format | MANIPULATION_DESTINATION_FILE_FORMAT | Select | ✅ | tsv | Tab-Separated File (.tsv): tsv; Pipe-Separated File (.psv): psv; Excel File (.xlsx): xlsx; Parquet (.parquet): parquet; Stata (.dta): stata; HDF5 (.h5): hdf5 | Type of file that you want the CSV file(s) converted into. |
| Local Folder Name | MANIPULATION_SOURCE_FOLDER_NAME | Alphanumeric | ➖ | - | - | Name of the local folder on Shipyard where the target file lives. If left blank, will look in the home directory. |
| Local File Name Match Type | MANIPULATION_SOURCE_FILE_NAME_MATCH_TYPE | Select | ✅ | exact_match | Exact Match: exact_match; Regex Match: regex_match | Determines if the text in "Local File Name" will look for one file with exact match, or multiple files using regex. |
| Local File Name | MANIPULATION_SOURCE_FILE_NAME | Alphanumeric | ✅ | - | - | Name of the target file on Shipyard. Can be regex if "Match Type" is set accordingly. |
| New Folder Name | MANIPULATION_DESTINATION_FOLDER_NAME | Alphanumeric | ➖ | - | - | Folder where the newly converted file(s) should be created on Shipyard. Leaving blank will place the file in the home directory. If the folder does not already exist, it will be created. |
| New File Name | MANIPULATION_DESTINATION_FILE_NAME | Alphanumeric | ➖ | - | - | What to name the newly converted files on Shipyard. If left blank, defaults to the original file name(s) with an updated extension based on the selected file format. |
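
The defaults described for "New Folder Name" and "New File Name" can be sketched as below. This is a hedged illustration only: the helper `destination_path` and the `EXTENSIONS` mapping are hypothetical names, with the extensions taken from the option labels for "Destination File Format".

```python
import os

# Extension for each format value, taken from the option labels above.
EXTENSIONS = {
    "tsv": ".tsv",
    "psv": ".psv",
    "xlsx": ".xlsx",
    "parquet": ".parquet",
    "stata": ".dta",
    "hdf5": ".h5",
}


def destination_path(source_file: str, file_format: str,
                     folder: str = "", new_name: str = "") -> str:
    """Build the output path for a converted file (hypothetical helper)."""
    if new_name:
        file_name = new_name
    else:
        # Default: keep the original name and swap in the new extension.
        stem, _ = os.path.splitext(os.path.basename(source_file))
        file_name = stem + EXTENSIONS[file_format]
    if folder:
        # The destination folder is created if it does not already exist.
        os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, file_name)
```

Under these assumptions, converting `report.csv` to Stata with both fields left blank would produce `report.dta` in the current directory.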

YAML

Below is the YAML template for this Blueprint, which can be used in the Fleet YAML Editor.

source:
  blueprint: File Manipulation - Convert CSV
  inputs:
    MANIPULATION_DESTINATION_FILE_FORMAT: tsv ## REQUIRED
    MANIPULATION_SOURCE_FOLDER_NAME: null
    MANIPULATION_SOURCE_FILE_NAME_MATCH_TYPE: exact_match ## REQUIRED
    MANIPULATION_SOURCE_FILE_NAME: null ## REQUIRED
    MANIPULATION_DESTINATION_FOLDER_NAME: null
    MANIPULATION_DESTINATION_FILE_NAME: null
  type: BLUEPRINT
guardrails:
  retry_count: 0
  retry_wait: 0s
  runtime_cutoff: 1h0m0s
  exclude_exit_code_ranges:
    - "0"