When it comes to building workflows, every team has wildly different use cases and needs.
Some data teams may only need to run a couple of scripts one after another. These workflows boil down to "Do this, then that". If something breaks along the way, it's ok to shut the entire process down and throw an error.
Some teams with large datasets may need their workflows to run on separate chunks of data in parallel. When all jobs have completed, they can move on to other steps in the process.
We've even seen teams with workflows that need to run "resolution logic" to appropriately fix subsets of data that have problems without disturbing the rest of the workflow.
Every day, we're finding new workflows that data teams are managing. With so many different types of connections and setups that could be built, it's important for the tool you select to be able to handle any use case that you throw at it, now or in the future.
Create Complex Branching Logic
Today, we're making it easier than ever to create complex workflows with the addition of conditional paths in all of your Fleets.
If you're not familiar with Fleets, they are Shipyard's intuitive, visual way to build out data pipelines by connecting multiple Vessels (scripts) together. Every Vessel that's connected as a part of a Fleet can share data seamlessly downstream.
With the introduction of conditional paths, Shipyard now allows you to create workflows that are far more flexible and resilient to potential issues that may come up. These conditions can quickly be set and updated in a visual editor, eliminating the need to create a complex DAG with code.
By default, when you connect Vessels in a Fleet, they will run one after the other if the upstream Vessels are Successful. With the introduction of conditional paths, teams will gain the ability to make branching logic where different sets of Vessels run when upstream Vessels error out or when they complete regardless of status.
As always, Vessels still have guardrails to run retries before settling on a final status. In addition, Vessels can still be connected with sequential, branching, or converging logic alongside the new conditional logic.
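To make the three path conditions concrete, here is a minimal Python sketch of the decision they describe. It is only a mental model, not Shipyard's API: the names `Status`, `PathCondition`, `should_run`, and `final_status` are hypothetical, and the retry count is an assumed parameter.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    ERRORED = "errored"


class PathCondition(Enum):
    SUCCESS = "success"      # default: run only if the upstream Vessel succeeded
    ERRORED = "errored"      # run only if the upstream Vessel errored out
    COMPLETED = "completed"  # run regardless of the upstream status


def final_status(task, max_retries=2):
    """Run a task, retrying on failure, and settle on one final status."""
    for _ in range(max_retries + 1):
        try:
            task()
            return Status.SUCCESS
        except Exception:
            continue  # guardrail: try again before giving up
    return Status.ERRORED


def should_run(condition, upstream):
    """Decide whether a downstream step runs, given the upstream final status."""
    if condition is PathCondition.COMPLETED:
        return True
    if condition is PathCondition.SUCCESS:
        return upstream is Status.SUCCESS
    return upstream is Status.ERRORED
```

In this model, the path condition is only evaluated once retries are exhausted, which mirrors the idea that a Vessel settles on a final status before anything downstream is triggered.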
Error Handle like a Pro
Don't stop at just receiving an automatic error notification when a step in your pipeline fails. With the new Error path condition, you can build out a series of events that trigger when specific steps of your Fleet end up failing.
For example, let's say you're running multiple subsets of dbt models and one of them fails. While your main "success branch" would send reports, run models, and perform other actions, your "error branch" could reload your data from Fivetran and send Slack updates to the relevant team or vendor, letting them know about the data issue.
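The dbt scenario above can be simulated with a toy runner. This is a simplified sketch under stated assumptions: the Vessel names, the `run_fleet` function, and the edge format are all hypothetical, the real dbt, Fivetran, and Slack calls are replaced by stand-in functions, and converging paths are not handled.

```python
from collections import deque


def run_fleet(vessels, edges, start):
    """Run stand-in Vessels, following only the edges whose condition matches.

    vessels: dict of name -> callable (raises an exception to signal an error)
    edges:   list of (upstream, downstream, condition), where condition is
             "success", "errored", or "completed"
    """
    ran = []
    queue = deque([start])
    while queue:
        name = queue.popleft()
        try:
            vessels[name]()
            status = "success"
        except Exception:
            status = "errored"
        ran.append((name, status))
        for upstream, downstream, condition in edges:
            if upstream == name and condition in ("completed", status):
                queue.append(downstream)
    return ran


def ok():
    pass


def failing_dbt_run():
    raise RuntimeError("a dbt model failed")


vessels = {
    "run_dbt_models": failing_dbt_run,
    "send_reports": ok,
    "reload_from_fivetran": ok,
    "notify_slack": ok,
}
edges = [
    ("run_dbt_models", "send_reports", "success"),          # success branch
    ("run_dbt_models", "reload_from_fivetran", "errored"),  # error branch
    ("reload_from_fivetran", "notify_slack", "success"),
]

result = run_fleet(vessels, edges, "run_dbt_models")
```

Because the dbt step fails here, `send_reports` never runs, while the reload and the Slack notification do.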
Run Jobs No Matter the Result
Sometimes, you might not care about status. Rain or shine, you just want your scripts to run one after the other. That's where the new Completed path condition comes in.
You may want to build out a process that alerts teams to a potential data issue and then runs a workflow that immediately resolves it. With the Completed condition, the fix still runs even if the message fails to send.
You could also create your own file watcher to process new files being dumped on an external SFTP. Whether your watcher script finds a new file or not, you may still want it to log the attempt externally.
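As a rough illustration of the file watcher idea, the sketch below checks a directory for new files and records the attempt whether or not anything was found. It makes several assumptions: a local temporary directory stands in for the external SFTP server, an in-memory list stands in for the external log, and `find_new_files` and `watch_once` are hypothetical names.

```python
import tempfile
from pathlib import Path


def find_new_files(directory, seen):
    """Return names of files not seen before; raise if there are none."""
    new = sorted(
        p.name for p in directory.iterdir() if p.is_file() and p.name not in seen
    )
    if not new:
        raise FileNotFoundError("no new files found")
    return new


def watch_once(directory, seen, attempts):
    """Check for new files once, then log the attempt regardless of outcome."""
    try:
        files = find_new_files(directory, seen)
        seen.update(files)
        status = "success"
    except FileNotFoundError:
        files = []
        status = "errored"
    # Equivalent of a "Completed" path: this step runs on success or error.
    attempts.append({"status": status, "files": files})
    return status


with tempfile.TemporaryDirectory() as tmp:
    drop_dir = Path(tmp)
    seen, attempts = set(), []
    first = watch_once(drop_dir, seen, attempts)    # directory is empty
    (drop_dir / "orders.csv").write_text("id\n1\n")
    second = watch_once(drop_dir, seen, attempts)   # one new file arrived
```

Both attempts end up in `attempts`, even though the first one found nothing.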
Conditional paths are now available to all subscribers and can be tested with any trial account. Sign up for a free 14-day trial to start creating Fleets with complex logic based on a Vessel's status of Success, Errored, or Completed.
We're looking forward to seeing how users will take advantage of this functionality to quickly develop and deploy data solutions.
Shipyard is a modern data orchestration platform for data engineers to easily connect tools, automate workflows, and build a solid data infrastructure from day one.
Shipyard offers low-code templates that are configured using a visual interface, replacing the need to write code to build data workflows while enabling data engineers to get their work into production faster. If a solution can’t be built with existing templates, engineers can always automate scripts in the language of their choice to bring any internal or external process into their workflows.
The Shipyard team has built data products for some of the largest brands in business and deeply understands the problems that come with scale. Observability and alerting are built into the Shipyard platform, ensuring that breakages are identified before being discovered downstream by business teams.
With a high level of concurrency and end-to-end encryption, Shipyard enables data teams to accomplish more without relying on other teams or worrying about infrastructure challenges, while also ensuring that business teams trust the data made available to them.