Orchestrate Data at Scale with our API

Steven Johnson

To create a perfect world of software interoperability, every SaaS product needs an API. In fact, we build most of our integrations with other software using their APIs.

We thought it was a bit odd to use APIs from other vendors while not having one of our own for Shipyard. We've been working hard over the past few months to rectify this, and today we're joining the party.

Introducing the Shipyard API

We are excited to announce that Shipyard now has a public API that gives you the power to access information and perform actions that were never possible before.

Every user on a paid plan now has access to the Settings page, where you can generate an API key. This key lets you access the API for all of the organizations you belong to.

Using the Shipyard API, you can now:

  • Create a new Fleet programmatically using YAML.
  • Retrieve the YAML for a specific Fleet.
  • Edit any Fleet's configuration.
  • Export a Fleet's historical logs.

These four features open up a world of possibilities for your data team. As always, you're still able to trigger your Fleets and pass data to them through our Webhooks.

Store your Workflow Configurations

# Export a Fleet's YAML definition:

curl https://api.app.shipyardapp.com/orgs/{org_id}/projects/{project_id}/fleets/{fleet_id} \
    --header "X-Shipyard-API-Key: {your_api_key}"

In Shipyard, it's easy to build hundreds of Fleets, but it's just as easy to lose track of how everything is running, why it's running, and who is responsible for making updates. That's why we're now giving you the flexibility to export the YAML definitions for all of your Fleets so you can store them for safekeeping.

This feature enables you to:

  • Export all of your Fleet YAMLs to analyze their structure at a higher level.
  • Track which Blueprints are the most used in your organization.
  • Track historical versions of a Fleet in your own cloud storage.
  • Attach YAMLs to tickets in your project management tool.
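
As a rough sketch of the "historical versions" idea, the Python below builds the same request as the curl command above and writes each export to a timestamped file. Only the URL and the `X-Shipyard-API-Key` header come from that curl example. The helper names, the `fleet_snapshots` folder, and the file naming scheme are assumptions made for illustration.

```python
import urllib.request
from datetime import datetime, timezone
from pathlib import Path

API_ROOT = "https://api.app.shipyardapp.com"


def fleet_yaml_request(org_id, project_id, fleet_id, api_key):
    """Build (but do not send) the GET request for one Fleet's YAML."""
    url = f"{API_ROOT}/orgs/{org_id}/projects/{project_id}/fleets/{fleet_id}"
    return urllib.request.Request(url, headers={"X-Shipyard-API-Key": api_key})


def snapshot_path(folder, fleet_id, when):
    """Hypothetical naming scheme: one file per Fleet per export time."""
    return Path(folder) / f"{fleet_id}_{when:%Y%m%dT%H%M%SZ}.yaml"


def export_fleet(org_id, project_id, fleet_id, api_key, folder="fleet_snapshots"):
    """Download a Fleet's YAML and store it as a new snapshot file."""
    request = fleet_yaml_request(org_id, project_id, fleet_id, api_key)
    with urllib.request.urlopen(request) as response:
        body = response.read()
    path = snapshot_path(folder, fleet_id, datetime.now(timezone.utc))
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(body)
    return path
```

Calling `export_fleet(...)` on a schedule would build up a history of files that you could then sync to your own cloud storage.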

Manage Data Workflows with Git

With the newfound ability to conveniently retrieve a Fleet's YAML, you can manage your YAML configurations in external version control services like GitHub, GitLab, or Bitbucket.

Let's take GitHub as an example. In an ideal world, you want your team to manage all edits to a Fleet through a code review process. Now you can build a system that maps Shipyard's YAML files to a single repository. Your team edits the YAML in GitHub, goes through a PR approval, and then, upon merging the PR, a GitHub Action pushes the updated YAML to Shipyard through the API. This entire process can empower your team to build and edit workflows without ever needing to access the Shipyard UI.
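
The deployment step of that process might look like the minimal Python sketch below, which a CI job could run after a merge. It sends the same request as the create/update curl call shown in the next section. The helper names and the idea of keeping one `.yaml` file per Fleet in a folder are assumptions for illustration, not part of Shipyard's tooling.

```python
import urllib.request
from pathlib import Path

API_ROOT = "https://api.app.shipyardapp.com"


def fleet_put_request(org_id, project_id, yaml_bytes, api_key):
    """Build (but do not send) the PUT request that creates or updates a Fleet."""
    return urllib.request.Request(
        f"{API_ROOT}/orgs/{org_id}/projects/{project_id}/fleets",
        data=yaml_bytes,
        method="PUT",
        headers={
            "Content-type": "application/yaml",
            "X-Shipyard-API-Key": api_key,
        },
    )


def deploy_folder(folder, org_id, project_id, api_key):
    """Send every .yaml file in the folder to the API, one request per file."""
    deployed = []
    for path in sorted(Path(folder).glob("*.yaml")):
        request = fleet_put_request(org_id, project_id, path.read_bytes(), api_key)
        # urlopen raises HTTPError on a 4xx/5xx response, which fails the CI job.
        with urllib.request.urlopen(request) as response:
            deployed.append((path.name, response.status))
    return deployed
```

In a CI job you would call `deploy_folder(...)` with the organization ID, project ID, and API key read from your CI system's secrets rather than hard-coding them in the repository.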

Build Templated Data Workflows on the Fly

# Create/Update a Fleet using YAML:

curl https://api.app.shipyardapp.com/orgs/{org_id}/projects/{project_id}/fleets \
    -X PUT \
    --data-binary @{file.yaml} \
    --header "Content-type: application/yaml" \
    --header "X-Shipyard-API-Key: {your_api_key}"

A lot of business logic is repeatable. You perform the same task against the same services in a slightly different fashion for each of your customers.

Let's imagine your organization has a standard workflow that sends your product data (which lives in Snowflake) to an SFTP server for each of your customers. The only problem? Every Snowflake query needs to pull client-specific data, and every SFTP data dump needs to store the contents in separate client-restricted folders.

Since we added the ability to have YAML configuration files for Fleets, you can now store the configuration of this workflow as code. When you get a new customer, you can run a Python script to edit the template by inserting customer-specific details. This edited YAML file can then be sent to the Shipyard API to create a new Fleet that will immediately be categorized, scheduled, and monitored.

With the API, you can generate as many Fleets as you want dynamically, making it much easier to manage your data workflows in bulk.
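
Here is a minimal sketch of the templating step, using only Python's standard library. The template keys (`name`, `query`, `sftp_folder`) are placeholders invented for this example and are not Shipyard's actual Fleet schema. In practice you would start from a YAML file exported from an existing Fleet and mark the customer-specific values with `$placeholders`.

```python
from string import Template

# Placeholder template. The keys below are illustrative only and are
# NOT Shipyard's real Fleet schema; start from a YAML you exported.
FLEET_TEMPLATE = Template(
    "name: Deliver product data - $customer\n"
    "query: SELECT * FROM product_data WHERE client_id = '$client_id'\n"
    "sftp_folder: /clients/$client_id/\n"
)


def render_fleet_yaml(customer, client_id):
    """Fill the customer-specific details into the template text."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # so a half-rendered template never gets uploaded.
    return FLEET_TEMPLATE.substitute(customer=customer, client_id=client_id)


# Hypothetical customers; each one gets its own rendered YAML document.
new_customers = [("Acme", "acme_01"), ("Globex", "globex_02")]
rendered = {
    client_id: render_fleet_yaml(name, client_id)
    for name, client_id in new_customers
}
```

Each value in `rendered` is a complete YAML document that could be sent to the create/update endpoint shown above.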

Run Data Workflows at Odd Hours

Ever wanted to run a workflow at unconventional and constantly changing times? Maybe you only want to run ads on rainy days. Maybe you want a reminder to walk the dog before dusk every day. Or maybe you only want to scrape the web for last-minute ticket sales on the days that the Yankees play home games.

Whatever your rationale, these schedules are now possible because you can update a Fleet using the API. You can build a separate Fleet in Shipyard that checks the condition daily and then updates another Fleet's YAML with the appropriate schedule. Alternatively, you could even run a Vessel at the end of a Fleet that updates the current Fleet's schedule!
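
The decision logic inside that "checker" Fleet can be quite small. The sketch below covers the dog-walking and game-day examples. How a schedule is written inside a Fleet's YAML is not covered in this post, so treat the cron string as a stand-in for whatever format your exported YAML uses. The function names and the 30-minute lead time are assumptions.

```python
from datetime import datetime, timedelta


def cron_before(event_time, lead=timedelta(minutes=30)):
    """Return a daily cron expression for `lead` before the given time."""
    run_at = event_time - lead
    return f"{run_at.minute} {run_at.hour} * * *"


def should_run_today(today, event_dates):
    """True when today is one of the dates that satisfy the condition."""
    return today in event_dates
```

For example, if dusk falls at 20:15 today, `cron_before(datetime(2024, 6, 1, 20, 15))` gives a schedule for 19:45. Your script would then write that value into the target Fleet's YAML before sending it back to the API.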

Make Data Workflow Updates in Bulk

Let's imagine you've built alerts that download data from BigQuery and send the results to your team on Slack. Your data team recently went through an overhaul of how you build tables and views, so now your alerts need to reference tables that live in a different schema. While you could go through manually and make this update, you don't even know which of your 50+ alerts use the specific table that needs to be updated.

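With the Shipyard API, making this change is a cinch. Export the YAML for every Fleet, search it for the old table name, replace it with the new one, and send the updated YAML back.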
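
Once the YAML for each Fleet has been exported, the search-and-replace step could be sketched in Python like this. It assumes you have already collected the exports into a dictionary that maps a Fleet ID to its YAML text. The function names and the table names in the usage note are hypothetical.

```python
def fleets_referencing(fleet_yamls, old_table):
    """Return the ids of Fleets whose YAML text mentions the old table."""
    return sorted(fid for fid, text in fleet_yamls.items() if old_table in text)


def retarget_table(fleet_yamls, old_table, new_table):
    """Return updated YAML text for only the Fleets that needed a change."""
    return {
        fid: text.replace(old_table, new_table)
        for fid, text in fleet_yamls.items()
        if old_table in text
    }
```

Calling `retarget_table(fleet_yamls, "analytics.orders", "reporting.orders")` returns only the Fleets that changed, ready to be sent back through the update endpoint. Plain substring matching will also hit longer names such as `analytics.orders_archive`, so review the output of `fleets_referencing` before pushing anything.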

Analyze Your Workflow Performance

# Export all voyages/logs in your organization:

curl https://api.app.shipyardapp.com/orgs/{org_id}/voyages \
    --header "X-Shipyard-API-Key: {your_api_key}"

For the first time, the Shipyard API lets you export information about the historical runs of your Fleets. You get access to all of the generated metadata, so you can better understand how your Fleets perform.

Here are just a few ideas of what you can do with this information:

  • Build a Tableau dashboard that displays the runtime and status of each Fleet run.
  • Monitor Fleet runtimes to see if they are creeping up over time.
  • Detect any anomalies where a Vessel or Fleet suddenly spikes or shrinks in runtime.
  • Analyze which Vessels in your Fleets error the most frequently so you can address larger issues.
  • Split up runtime costs between projects so you can verify which team initiatives are costing the most money.
  • Visualize a timeline of all the runs you have on Shipyard and how they overlap.
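
To give a feel for a few of these ideas, here is a small Python sketch that summarizes run counts, average runtime, and error counts per Fleet. The format of the exported data is not described in this post, so the record fields (`fleet_name`, `duration_seconds`, `status`) and the `"errored"` status value are assumptions. Map them to whatever columns your export actually contains.

```python
from collections import defaultdict
from statistics import mean


def summarize_runs(runs):
    """Group runs by Fleet and report run count, mean runtime, and error count."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["fleet_name"]].append(run)
    return {
        fleet: {
            "runs": len(items),
            "mean_seconds": mean(r["duration_seconds"] for r in items),
            "errors": sum(1 for r in items if r["status"] == "errored"),
        }
        for fleet, items in grouped.items()
    }
```

Running the same summary on a weekly export and comparing `mean_seconds` over time is one simple way to spot runtimes that are creeping up.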

Get Started Today

These use cases only scratch the surface of what's now possible with Shipyard. Our API is now available to all users on our paid plans. Sign up for our Team Plan to start automating your data orchestration process with our API. Learn more about the API through our documentation.

We're looking forward to seeing how users will take advantage of this functionality to quickly launch, monitor, and share data pipelines!


About Shipyard:
Shipyard is a modern data orchestration platform for data engineers to easily connect tools, automate workflows, and build a solid data infrastructure from day one.

Shipyard offers low-code templates that are configured using a visual interface, replacing the need to write code to build data workflows while enabling data engineers to get their work into production faster. If a solution can’t be built with existing templates, engineers can always automate scripts in the language of their choice to bring any internal or external process into their workflows.

The Shipyard team has built data products for some of the largest brands in business and deeply understands the problems that come with scale. Observability and alerting are built into the Shipyard platform, ensuring that breakages are identified before being discovered downstream by business teams.

With a high level of concurrency and end-to-end encryption, Shipyard enables data teams to accomplish more without relying on other teams or worrying about infrastructure challenges, while also ensuring that business teams trust the data made available to them.

For more information, visit www.shipyardapp.com or get started with our free Developer Plan.